DOCUMENT RESUME 



IR 055 439 

Michelson, Avra; Rothenberg, Jeff

Scholarly Communication and Information Technology: 
Exploring the Impact of Changes in the Research 
Process on Archives. Rand Reprints. 
Rand Corp., Santa Monica, Calif. 
RAND/RP-187 
93 
85p. 

Reports - Evaluative/Feasibility (142) — Journal 
Articles (080) 

American Archivist; v55 n2 p236-315 1992
MF01/PC04 Plus Postage. 

Access to Information; *Archives; Artificial
Intelligence; Change; *Communication (Thought
Transfer); Curriculum Development; Electronic
Publishing; *Futures (of Society); Human Resources;
Hypermedia; Information Dissemination; Information
Policy; *Information Technology; Professional
Development; *Research Methodology; Strategic
Planning; *Trend Analysis; Users (Information)
IDENTIFIERS Connectivity; End User Statistical Computing; Virtual
Reality

ABSTRACT 

The report considers the interaction of trends in
information technology and trends in research practices and the 
policy implications for archives. The information is divided into 4 
sections. The first section, an "Overview of Information Technology 
Trends," discusses end-user computing, which includes ubiquitous 
computing, end-user interfaces, and "online transition." Connectivity 
is examined in terms of access to computational and human resources; 
the trend toward interchange standards; and distributed versus
centralized control. The following technology trends affecting 
scholarly communication are examined: artificial intelligence; 
end-user publication and distribution; hypertext and hypermedia; 
visualization and virtual reality; and caveats. The second section, 
"Scholarly Communication and the Use of Current Information 
Technology," discusses a number of issues. Identification of sources 
and communication with colleagues is covered. Interpretation and 
analysis of sources is discussed, including computer-assisted 
analysis through conversion and computer-assisted analysis with
artificial intelligence. Electronic publishing and hypermedia are 
identified as findings for dissemination of research. Curriculum 
development and instruction is also discussed. Section 3, "Responses 
by the Library Profession to Changing Research Practices," includes 
promoting connectivity; conversion; software engineering; and 
transformations in professional roles. A fourth section, "Conclusion 
and Recommendations," is divided into the following parts: 
establishing a network-mediated archival practice; establishing a 
strategy for the future usability of electronic records; and 
recognizing and rewarding leadership. (Contains 224 references.) 
(AEF)

EXECUTIVE SUMMARY

The emergence and use of information 
technology is this century's most signifi- 
cant development affecting archival prac- 
tice. In response to this development, 
members of the archival profession have 
explored both the ways in which new tech- 
nology can improve the management of ar- 
chives, and the most appropriate methods 
for managing electronic records that result 
from the use of automated systems. But 
these two issues only partially address the 
impact of technology on archives. A third 
but indirect influence also deserves exam- 
ination: technology's impact on scholarly 
research methods, which has consequences 
for the use and management of archives. 



This article considers the policy implica-
tions for archives of trends resulting from
the infusion of information technology into 
the scholarly research process. 

The article considers the interaction of 
two distinct kinds of trends: trends in in- 
formation technology and trends in re- 
search practices, particularly among social 
scientists and humanists. Although much 
of the rapid growth and evolution of infor- 
mation technology may be unrelated to 
scholarly research, and aspects of scholarly 
research may be evolving in ways that have 
little connection with information technol- 
ogy, there is nevertheless a strong and im- 
portant interaction occurring between these 
two evolutions. The possibilities created by 






American Archivist / Spring 1992 



new technology are prompting transfor- 
mations in scholarly practice, and these 
transformations are in turn stimulating new 
needs among researchers and are inspiring 
further technological breakthroughs. Un-
derstanding the nature of this interaction is
necessary for forecasting the most likely 
ways in which new scholarly methods will 
demand innovative services and responses 
from the archival community. 

This article explores two fundamental 
trends in information technology affecting 
scholarship: end-user computing and con- 
nectivity. Several other technologies of rel- 
evance to scholarship are also considered, 
including artificial intelligence, end-user 
publication and distribution, hypertext and
hypermedia, and visualization and virtual 
reality. Changes in the research process re- 
sulting from scholarly use of information 
technology are considered within the broad 
framework of scholarly communication. The 
scholar's use of currently available tech- 
nology to search for sources, communicate 
with colleagues, interpret and analyze source 
materials, disseminate research findings, and 
prepare curriculums and instructional ap-
plications is examined. Our key finding is
the exploding use among researchers of in- 
formation technology on research and ed- 
ucation networks to advance scholarship. 
Far from being visionary, this future is al-
ready present: It is currently being experi-
enced by significant and increasing numbers 
of scholars from many disciplines. The li- 
brary profession is responding to the emer- 
gence of network-mediated scholarship by 
promoting global connectivity, performing 
conversions of print sources to machine- 
readable form, undertaking the software 
engineering of full-text delivery systems for 
online materials, and collaborating with 
technologists in the use of computing and 
communication technology to meet spe- 
cialized researcher needs. 

The report recommends that the archival 
profession:

1. establish a presence on the Internet/
NREN. 

2. make source materials available for 
research use over the Internet. 

3. create documentation strategies to 
document network-mediated schol- 
arship and the development of re- 
search and education networks. 

4. develop archival methods suitable for
operation with NREN. 

5. take user methods and future com- 
putational capacity into account in es- 
tablishing policies on the management 
of software-dependent records. 

6. recognize and reward initiatives that 
(a) advance archival management of 
electronic records, (b) respond to 
scholarly use of information technol- 
ogy, or (c) promote a network-me- 
diated archival practice. 

This article is the result of nearly two 
years of collaboration between Avra Mich-
elson and Jeff Rothenberg. Earlier versions
or derivative presentations of the article were 
reported at annual meetings of the Society 
of American Archivists, National Associ- 
ation of Government Archivists and Rec- 
ords Administrators (NAGARA), National 
Net '92, and the Library of Congress 
Workshop on Electronic Texts. The article 
is available electronically on the file server 
operated by the Coalition for Networked 
Information. (Contact craig@cni.org for
instructions.) 

INTRODUCTION 

The emergence and use of information 
technology is this century's most signifi- 
cant development affecting archival prac- 
tice. 1 In response to this development, 



1 The term archives refers broadly to historic sources
of enduring value that document the activities of gov-
ernments, organizations, or individuals; it also refers
to the repositories responsible for preserving and mak-
ing available the historic record.

members of the archival profession have
explored both the ways in which new tech- 
nology can improve the management of 
archives 2 and the most appropriate methods 
for managing electronic records that result 
from the use of automated systems. 3 But 



2 See Marion Matters, ed., Automated Records and
Techniques in Archives: A Resource Directory (Chi-
cago: Society of American Archivists. 1 12-37,
for a bibliography on the topic. A selection of the
seminal literature includes: Thomas H. Hickerson,
Archives and Manuscripts: An Introduction to Auto-
mated Access, SAA Basic Manual Series (Chicago:
Society of American Archivists, 1981); Richard H.
Lytle, "An Analysis of the Work of the National In-
formation Systems Task Force," American Archivist
47 (Fall 1984): 357-65 (see also other articles in this
issue of AA, especially Thomas E. Brown, "The So-
ciety of American Archivists Confronts the Com-
puter," pp. 366-82); David Bearman, Towards
National Information Systems for Archives and Man-
uscript Repositories: The NISTF Papers (Chicago:
Society of American Archivists, 1987), as well as
Bearman's Archives and Museum Informatics tech-
nical reports and quarterly newsletter; and two special
issues of the American Archivist devoted to "Stan-
dards for Archival Description" (Fall 1989 and Win-
ter 1990). More recently, archivists have begun to
explore the use of specific technologies to support
archival functions. See, for instance: Optical Digital
Image Storage System: Project Report (Washington,
D.C.: National Archives and Records Administration,
Archival Research and Evaluation Staff, March 1991);
Avra Michelson, Expert Systems Technology and Its
Implications for Archives, National Archives Tech-
nical Information Paper no. 9 (Washington, D.C.:
National Archives and Records Administration, Ar-
chival Research and Evaluation Staff, March 1991);
and Anne R. Kenney and Lynne K. Personius, "The
Future of Digital Preservation," Advances in Pres-
ervation and Access, vol. 1 (Westport, Conn.: Meck-
ler Press, forthcoming).

3 Charles Dollar identifies salient literature on this
topic in his work, The Impact of Information Tech-
nologies on Archival Principles and Methods (Ma-
cerata, Italy: University of Macerata Press, 1992). A
selection of seminal publications includes: Charles
Dollar, "Appraising Machine-Readable Records," in
A Modern Archives Reader: Basic Readings on Ar-
chival Theory and Practice, edited by Maygene F.
Daniels and Timothy Walch (Washington, D.C.: Na-
tional Archives and Records Service, 1984); Margaret
L. Hedstrom, Archives and Manuscripts: Machine-
Readable Records, SAA Basic Manual Series (Chi-
cago: Society of American Archivists, 1984); Harold
Naugler, The Archival Appraisal of Machine-Reada-
ble Records: A RAMP Study with Guidelines (Paris:



these two issues address only a portion of
the impact of technology on archives. A 
third though indirect influence that de- 
serves examination is technology's impact 
on scholarly research methods, which has 
consequences for the use and management 
of archives. This article considers the pol- 
icy implications for archives of trends re- 
sulting from the infusion of information 
technology into the scholarly research 
process. 

The term information technology refers 
to the computing and communications 
technology used to obtain, store, organize, 
manipulate, and exchange information. The 
definition includes computer hardware and 
software, as well as the telecommunica- 
tions devices and computer-based networks 
that connect them. 4 The influence of infor-
mation technology on the research process,
already evident, promises to deeply pene- 
trate scholarly practice as we enter the 
twenty-first century. This technology is en- 
abling academics to change significantly the 
way they communicate and collaborate, 
identify and analyze sources, store and re- 
trieve data, and disseminate the products 
of their research. Although technology af- 
fects the research process across a spectrum 
of disciplines and professions, this article 
focuses on changes in the social sciences 



General Information Programme and UNISIST,
UNESCO, 1984); United Nations, Administrative
Committee for the Coordination of Information Sys-
tems, Technical Panel on Records Management, Elec-
tronic Records Guidelines: A Manual for Policy
Development (New York: United Nations, 1989); and
Research Issues in Electronic Records (St. Paul,
Minn.: Published for the National Historical Publi-
cations and Records Commission, Washington, D.C.,
by the Minnesota Historical Society, 1991). See also
Tom Ruller, "Managing and Appraising GIS Data:
Issues and Strategies," unpublished paper presented
at the 1991 annual meeting of the Society of American
Archivists, Philadelphia.

4 John R. B. Clement, "Increasing Research Pro-
ductivity Through Information Technology: A User-
Centered Viewpoint," unpublished paper, 19 October
1989, p. 3.

and humanities because scholarly patrons
of archives tend to be drawn most heavily 
from these fields. 5 

Undertaking the research for this report 
was motivated, in part, by efforts in the 
archival profession to provide answers to 
questions related to the use of source ma- 
terials, such as the following: Who are the 
(potential) users of primary sources? What 
are the characteristics of the modern re- 
search process? How do patrons frame re- 
search questions? 6 In the past few years, 
several empirical studies have been con- 
ducted on patterns of research use within 
or across repositories or specific disci- 
plines. 7 Although these studies provide 



5 The terms scholar and researcher generally are
used throughout the paper to refer to social scientists 
and humanists unless specified otherwise. Neverthe- 
less, we believe that the research trends identified in 
this report apply to a broader range of the research 
community. 

6 A selection of key literature that has advanced the
archival profession's conceptual framework includes:
Mary Jo Pugh, "The Illusion of Omniscience: Subject
Access and the Reference Archivist," American Ar-
chivist 45 (Winter 1982): 33-44; Elsie T. Freeman,
"In the Eye of the Beholder: Archives Administration
from the User's Point of View," American Archivist
47 (Spring 1984): 111-23; Paul Conway, "Facts and
Frameworks: An Approach to Studying the Users of
Archives," American Archivist 49 (Fall 1986): 393-
407; and Lawrence Dowler, "Availability and Use of
Records: A Research Agenda," American Archivist
51 (Winter/Spring 1988): 74-86.

7 A selection of the key studies includes: Major
Findings, Conclusions and Recommendations of the
Researcher and Public Service Component Evaluation
Study (Ottawa: Public Archives of Canada, 1985);
Paul Conway, "Research in Presidential Libraries: A
User Study," Midwestern Archivist 11 (1986): 35-
56; William J. Maher, "The Use of User Studies,"
Midwestern Archivist 11 (1986): 15-26; David Bear-
man, "User Presentation Language in Archives," Ar-
chives and Museum Informatics 3 (Winter 1989-90):
3-7; Paul Conway, Partners in Research: Towards
Enhanced Access to the Nation's Archives (Washing-
ton, D.C.: National Archives and Records Adminis-
tration, forthcoming); and Ann D. Gordon, Using the
Nation's Documentary Heritage: The Report of the
Historical Documents Study, supported by the Na-
tional Historical Publications and Records Commis-
sion in cooperation with the American Council of
Learned Societies (Washington, D.C.: National His-
torical Publications and Records Commission, 1992).



valuable insights on users and patterns of 
use for the period of study, they typically 
fail to consider their findings within the 
context of a broader analysis of scholarly 
research trends. 

Archivists need more than snapshots as 
a basis for policy formulation. An accurate 
depiction of current research practices is 
necessary, but archival strategic planning 
must also involve an analysis of significant 
trends. This article addresses the interac- 
tion of two distinct sets of trends. Elec- 
tronic information technology as a 
phenomenon is experiencing rapid growth
and evolution, much of which may be un-
related to scholarly research. At the same
time, aspects of scholarly research may be 
evolving in ways that have little connection 
with information technology. Neverthe- 
less, a strong and important interaction is 
occurring between these two movements. 
The possibilities created by new technol- 
ogy are prompting transformations in 
scholarly practice, and these transforma- 
tions are in turn stimulating new needs 
among researchers and further inspiring 
technological breakthroughs. Understand- 
ing the nature of this interaction is neces- 
sary for forecasting the most likely ways in 
which new scholarly methods will demand 
innovative services and responses from the 
archival community. 

Trends analysis is inherently somewhat 
circular, since technological changes 
"drive" changes in scholarly practice only 
to the extent that the new technology pro- 
vides capabilities that scholarly researchers 
can use in meaningful and productive ways. 
It involves more than the description of ar-
bitrary technological trends: Their rele-
vance must be derived from the perspective
of scholarly research. It also involves more 
than the description of current trends in 
scholarship: To the extent that scholarship 
uses information technology, it is neces- 
sarily constrained by what is currently pos- 
sible. Only by considering the joint evolution 
of technology and scholarly methods can a
convincing picture of the future be con-
structed. The remainder of this article at-
tempts to create such a picture in order to
examine its implications for archives dur- 
ing this decade and beyond the turn of the 
millennium. 8 

This article presents a conceptual frame- 
work for understanding long-term trends 
relevant to the scholarly research process. 
The topic is introduced by a discussion of 
scholarly communication and the early use 
of computers among academics. An analy- 
sis of information technology trends most 
pertinent to the conduct of research fol- 
lows. The third section explores, through 
case examples, trends in the use of cur-
rently available information technology by
social scientists and humanists. The fourth 
section considers model efforts by those in 
the library profession to respond to changes 
in the research process. The article con- 
cludes with policy recommendations that 
address key changes needed in archival 
practices and methods to respond to trans- 
formations in scholarly research methods, 
and the growing prominence of a new elec-
tronic communication medium: research
and education networks.

BACKGROUND 

Scholarly inquiry represents a timeless 
human quest to understand the world around 



8 Because this paper examines the interaction of two
distinct trends, differing frameworks are used to or-
ganize the key sections (Overview of Information
Technology Trends and Scholarly Communication and
the Use of Current Information Technology). The for-
mer uses information technology trends as the organ-
izing framework, whereas the latter uses the elements
of scholarly communication as a structuring frame-
work. The relationship between technology and schol-
arship is both dynamic and complex, and our
understanding of it continues to evolve. Although it
was suggested to us that the framework used to ex-
plore information technology trends should be used
as the organizing framework for the section on current
scholarly practices as well (e.g., a more technological
determinist approach), we consider the dual frame-
works, and the analysis of the relationships between
them, one of the paper's key virtues.



us. Although this quest for understanding 
is a sustaining element of human culture, 
the techniques of the scholar have changed 
over time. No longer characterized by oral 
tradition and forum dialogues, the modern 
research process is commonly understood 
to entail five processes: (1) identification 
of sources, (2) communication with col- 
leagues, (3) interpretation and analysis of 
data, (4) dissemination of research find- 
ings, and (5) curriculum development and 
instruction for preparing the next genera- 
tion of scholars. Refinement of the schol- 
ar's original idea or hypothesis occurs 
throughout these more tangible processes. 
The impact of information technology on 
these processes is resulting in unprece- 
dented transformations in scholarly com- 
munication. 

Scholarly communication is the term used 
to refer to the interrelationship of the five 
processes of modern scholarship. 9 The term 
implies both a dynamic exchange of infor- 
mation and ideas and an interdependence 
among publishers, librarians and others in 
the support of scholarship and the advance- 
ment of knowledge. Scholarly communi- 
cation is generally understood to involve 
the social exchange of intellectual and cre- 
ative activity from one scholar to another. 10 
As a concept, it denotes a recognition of 
the mutual reliance of researchers, publish- 
ers, professional associations, and libraries 
and archives in fostering intellectual pur- 



9 The American Council of Learned Societies pop-
ularized the term scholarly communication among ac-
ademics as a result of their mid-1980s survey on the
experience of more than five thousand humanists as 
authors using scholarly publications, libraries, and 
computers. The findings of the report appear in Her- 
bert C. Morton and Anne J. Price, The ACLS Survey 
of Scholars: Final Report of Views on Publications, 
Computers, and Libraries (Washington, D.C.: Office 
of Scholarly Communication and Technology, Amer- 
ican Council of Learned Societies, 1989). 

10 Thomas W. Shaughnessy, "Scholarly Commu-
nication: The Need for an Agenda for Action—A
Symposium," Journal of Academic Librarianship 15
(May 1989): 69.

suits. This interdependence implies that a
change in the practice of any one of these 
agents is capable of inspiring changes in 
the entire paradigm. In transforming the way 
in which academics learn of primary source 
materials, search and gather data, interpret 
and analyze sources, and report findings to 
the scholarly community, information tech- 
nology is influencing significant aspects of 
scholarly communication. Consequently, 
changes in scholarly research patterns have 
ramifications for archives and libraries. 11 

The influence of modern technology on 
scholarly communication began with the 
birth of computers. More than forty years 
ago, the scientific community was the first 
of the academic disciplines to introduce 
computers into the research process. As 
computing power expanded, geographi- 
cally dispersed scientists began collaborat- 
ing on research questions requiring 
computers. In 1969, in response to the needs 
of this community, the U.S. Defense De- 
partment's Advanced Research Projects 
Agency (ARPA) developed the ARPA-
NET, a telecommunications network de-
signed to allow the sharing of expensive
computer resources among government and 
academic research laboratories. 12 Scientific 
computing has evolved to include the use 
of electronic networks for electronic mail 
(e-mail) and for access to supercomputing 
processing power and to software that fa- 
cilitates group work. 13 



11 For a historical consideration of the relationship
between scholarly communication and libraries, see
Phyllis Dain and John C. Cole, eds., Libraries and
Scholarly Communication in the United States: The
Historical Dimension, Beta Phi Mu Monograph, no.
2 (New York: Greenwood Press, 1990).

12 Clifford A. Lynch and Cecilia M. Preston, "Ev-
olution of Networked Information Resources," Pro-
ceedings of the Twelfth National Online Meeting May
7-9, 1991 N.Y., N.Y., Martha E. Williams, ed. (Med-
ford, N.J.: Learned Information, 1991): 221-30.

13 See, for instance, a recent book of readings, ed-
ited by Irene Greif, Computer-Supported Cooperative
Work (San Mateo, Calif.: Morgan Kaufmann, 1988);
for an assessment of the information needs of the sci-


Since the 1970s, a large and complex 
array of networks has emerged to support 
collaborative scientific research. As the 
scientific need for connectivity increased, 
network infrastructures at institutions, or- 
ganizations, commercial enterprises and re- 
gions expanded. Today, more than three 
thousand regional, federal, commercial, and 
organizational networks connect an esti- 
mated 5 million scholars in seventy coun- 
tries. 14 The Internet, the existing network 
of research and education networks, com- 
prises thousands of trunk lines that cur- 
rently carry from 1.5 to 45 million bits per 
second. 15 The National Research and Ed-
ucation Network (NREN), authorized in
1991 and due to be operational by 1995, 
will be capable of transmitting 1 billion bits 
of data— the equivalent of fifty thousand 
typewritten pages—every second. 16 
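
As a rough check on the scale of these figures, the sketch below works out what the comparison between one billion bits and fifty thousand typewritten pages implies. It assumes that one character occupies 8 bits, which the article does not state, and the function names are illustrative only.

```python
# Rough check of the bandwidth figures quoted in the text.
# Assumption (not stated in the article): one character occupies 8 bits.
BITS_PER_CHAR = 8

def implied_chars_per_page(bits_per_second, pages_per_second):
    """Characters per page implied by a line rate and a pages-per-second claim."""
    return bits_per_second / pages_per_second / BITS_PER_CHAR

def pages_per_second(bits_per_second, chars_per_page):
    """Pages per second a line can carry at the given page size."""
    return bits_per_second / (chars_per_page * BITS_PER_CHAR)

# NREN target: 1 billion bits per second, said to equal 50,000 pages.
page_size = implied_chars_per_page(1e9, 50_000)

# Internet trunk lines cited in the text: 1.5 to 45 million bits per second.
slow_trunk = pages_per_second(1.5e6, page_size)
fast_trunk = pages_per_second(45e6, page_size)

print(page_size, slow_trunk, fast_trunk)
```

Under that assumption the comparison implies about 2,500 characters per page, and the existing trunk lines would carry roughly 75 to 2,250 such pages every second.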

In recent years, the global expansion of 
electronic networks has allowed for world- 
wide collaboration among scientists. Fur- 
ther, the connectivity provided by greater 
bandwidth lets scientists process previously 
unimaginable amounts of data. Expanding 
the volume of data able to travel across 
networks permits scientists to explore new 
types of questions because greater amounts 
of data are available with less time required 
for analysis. Equally important, the prom- 



entific scholar, see Communications in Support of Sci-
ence and Engineering: A Report to the National Science
Foundation from the Council on Library Resources
(Washington, D.C.: The Council, August 1990); for
a discussion of state-of-the-art collaboration-oriented
software, see Daniel Williams, "New Technologies
for Coordinating Work," Datamation 36 (15 May
1990): 92-96.

14 Clifford Lynch, "Telecommunications and Net-
working: A Tutorial," presentation made at the
American Society for Information Science 54th An-
nual Meeting, Washington, D.C. (29 October 1991).

15 Lynch and Preston, "Evolution of Networked In-
formation Resources."

16 From a presentation by Paul Peters, executive di-
rector of the Coalition for Networked Information, to
the National Archives and Records Administration on
7 May 1991; see also Ralph Alberico, "The Devel-
opment of an 'Information Superhighway'," Com-
puters in Libraries 10 (January 1990): 34.






ise of increased computing power and ad- 
vances in telecommunications will allow 
scientists to expand the graphical display 
of research results, alleviating many prob- 
lems associated with interpreting very large 
data sets. 17 The trends characteristic of 
modern scientific inquiry— greater collab- 
oration, increased use of computer-assisted 
analysis of machine-readable sources, and 
expanded use of global research and edu- 
cation networks— increasingly represent 
trends in the social sciences and humanities 
as well. 

In the humanities, scholars initially used 
computers simply to store and retrieve data. 
In what is commonly believed to be the 
earliest project of its kind, Father Roberto 
Busa in 1949 began his effort to compile 
an index and concordance to the work of 
St. Thomas Aquinas. 18 But apart from the 
hard sciences, the field of political science 
is typically regarded as the discipline most 
responsible for transforming computer 
processing into an accepted scholarly 
method. What began as a simple use of 
computers by political scientists for 
processing survey data and analyzing na- 
tional opinion polls became a standard so- 
cial science methodology: quantitative 
analysis. During the past four decades, fol- 
lowing the lead of survey researchers, a 
range of scholars within academic disci- 
plines began to use computer technology to 
process large sets of numeric data. 19 



17 Clement, "Increasing Research Productivity," 3;
for a discussion of the role of imagery in human un-
derstanding, see Mary Alice White, "Imagery in Mul-
timedia," Multimedia Review (Fall 1990): 5-8.

18 See David S. Miall, ed., Humanities and the
Computer: New Directions (Oxford: Clarendon Press,
1990), 2.

19 As their numbers grew, quantitative scholars suc-
cessfully campaigned for the establishment of data
archives, special repositories designed to preserve and
provide access to machine-readable collections of sur-
vey, census, polling, and legislative data. See Kath-
leen M. Heim, "Social Scientific Needs for Numeric
Data: The Evolution of the International Data Archive
Infrastructure," Collection Management 9 (Spring
1987): 1-53.



The advance of information technology 
over the past several decades has aston- 
ished even the most visionary technolo- 
gists. Although certain predictions have 
proved too optimistic, the overall rate of 
advance has matched or surpassed the pro-
phecies of most experts, and it shows every
sign of continuing unabated during the next
few decades. Indeed, from 1980 to 1985, 
the period that marked the birth of personal 
computers, their use among scholars soared 
from nonexistent to more than 50 per- 
cent. 20 Today, the scholarly use of personal 
computers extends beyond storage and re- 
trieval of data and includes text editing, 
formatting, and text analysis. Increasingly 
scholars are turning to technology to do sta- 
tistical analysis, create databases, produce 
spreadsheets, and compile graphical im- 
ages of data. Many scholars consider tech- 
nology an essential instructional tool for 
generating simulations, capturing data, and 



20 Morton and Price, ACLS Survey of Scholars, 33.
The ACLS study represents the only currently avail-
able direct survey of scholars on their use of com-
puters. But the survey polled only scholars who are
members of professional associations. For the past
few years, EDUCOM and the University of Southern
California have conducted an annual survey of aca-
demic computing directors on campus planning, pol-
icies, and procedures affecting the use of desktop
computers. According to reports by academic com-
puting centers, 39.5% of faculty at two year public
and four year public and private colleges and univer-
sities have access to or own computers. This figure,
however, is considered unreliable, as it is based on
estimates by academic computing staff, rather than on
direct counts. Furthermore, no one believes that actual
usage has dropped from 1985 to 1991, as implied by
the discrepancy between the ACLS and EDUCOM/
USC figures. According to Kenneth C. Green, the
EDUCOM/USC survey developer and author of the
report on the findings, "our limited knowledge about
student and faculty access to and use of technology is
appalling." Green argues that a direct survey of schol-
ars is needed to identify actual computer usage. See
USC Center for Scholarly Technology Newsletter,
"Despite Budget Cuts, Campuses Attempt to Main-
tain Computing Services," (October 1991); Kenneth
C. Green, "A Technology Agenda for the 1990s,"
Change 23 (January/February 1991): 6-7; and Ken-
neth C. Green and Skip Eastman, Campus Computing
1990 (Los Angeles: University of Southern Califor-
nia, Center for Scholarly Technology, 1990).

providing individualized assistance to stu-
dents. 21

The driving force behind the advance of 
information technology has been the de- 
velopment of faster, smaller, and cheaper 
electronic devices, which can be used to 
produce machines with greater capabilities 
for manipulating and processing informa- 
tion. These machines have in turn inspired 
the production of more powerful and im- 
aginative programs and solution techniques 
(computational methods or algorithms) for 
solving problems that would be intractable
without this new computational power. The 
availability of increased computational 
power, in turn, has enabled the design of 
new computer hardware and software, pro- 
ducing a snowball effect in which each new 
generation of system facilitates the design 
of its successor. This process can be ex- 
pected to continue until designers reach the 
fundamental limitations of physics and ex- 
haust all technological alternatives, which 
does not appear imminent. An improvement
in computational power of six orders
of magnitude (a factor of a million) over 
the past two decades can be attributed to 
roughly equal improvements (three orders 
of magnitude each) in hardware and soft- 
ware. 22 It is not unreasonable to expect a 
comparable improvement to occur over the 
next two or three decades. As a result, in 
the next few decades an unimaginable 
amount of computational power will be 
available to scholars. This capacity com- 
pels the archival profession to determine 
the implications of the use of information 



21 See Miall, Humanities and the Computer, 4; and
Jean-Claude Gardin, "The Future Influence of Computers
on the Interplay Between Research and Teaching
in the Humanities," Humanities Communication
Newsletter 9 (1987): 17-18.

22 Grand Challenges: High Performance Computing
and Communications, The FY 1992 U.S. Research
and Development Program, A Report by the Committee
on Physical, Mathematical, and Engineering
Sciences, Federal Coordinating Council for Science,
Engineering and Technology, Office of Science and
Technology Policy (1991), 14-15.



technology by scholars for conventional ar- 
chival practices. 
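
The arithmetic behind the six-orders-of-magnitude estimate can be made concrete. The short Python sketch below is an illustration only and is not drawn from the cited report: it multiplies the two three-order-of-magnitude gains and then, assuming purely for illustration that growth was constant and compounding over twenty years, derives the implied yearly multiplier and doubling time. The function names are hypothetical.

```python
import math


def annual_growth_factor(orders_of_magnitude: float, years: float) -> float:
    """Constant yearly multiplier that compounds to 10**orders_of_magnitude over `years`."""
    return 10 ** (orders_of_magnitude / years)


def doubling_time_years(orders_of_magnitude: float, years: float) -> float:
    """Years needed to double capacity under the same constant-growth assumption."""
    return years * math.log(2) / (orders_of_magnitude * math.log(10))


# Three orders of magnitude each from hardware and from software.
hardware_gain = 10 ** 3
software_gain = 10 ** 3
combined_gain = hardware_gain * software_gain  # the gains multiply: 10**6

if __name__ == "__main__":
    print(combined_gain)                          # 1000000
    print(round(annual_growth_factor(6, 20), 2))  # 2.0
    print(round(doubling_time_years(6, 20), 2))   # 1.0
```

Under that constant-growth assumption, a millionfold gain in twenty years corresponds to computational power roughly doubling every year; a comparable gain over thirty years would imply a doubling time of about eighteen months.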

Although the future evolution of infor- 
mation technology is fairly predictable in 
broad outline, predicting precise details of 
how the technology will evolve is more dif- 
ficult. For our purposes, however, it is the 
broad outline of these trends that is most 
important. Our discussion of technology, 
therefore, avoids mentioning specific de- 
vices, techniques, or research results. In- 
stead, the next section examines trends of 
information technology that are likely to 
have the greatest impact on scholarly com- 
munication—and, by implication, on ar- 
chives management. The focus here is on 
broad descriptions and projections most 
relevant to the future of scholarly research. 
Later in this paper we examine how scholars
are actually using information technology
in their current work.

OVERVIEW OF INFORMATION 
TECHNOLOGY TRENDS 

The two most obvious and, for the purpose
of this paper, the most important information
technology trends that pertain
to scholarly communication are end-user 
computing and connectivity. These trends 
are distinct and separable, and each is dis- 
cussed in detail below. Ultimately, how- 
ever, it is the integration of the two that 
will have the greatest impact on scholarly 
communication. End-user computing en- 
hances the autonomy of the researcher, i.e., 
the researcher's ability to use the power of 
computation to conceptualize and execute 
research without sacrificing intellectual 
control by delegating computational tasks 
to specialists. Connectivity enhances the 
researcher's abilities to access data, collaborate,
seek input and feedback, and disseminate
ideas and results. The confluence
of these trends produces a rich interplay of 
synergistic effects, which are explored be- 
low. 

A number of more specific technology 
trends are also likely to affect scholarly
communication. Most of these are exam- 
ples of end-user computing or connectivity 
(or the integration of the two), but each 
warrants attention in its own right. The most 
relevant of these appear to be artificial in- 
telligence, end-user publication and distri- 
bution, hypermedia, and visualization and 
virtual reality. 

End-User Computing 

In the current context, end-user comput- 
ing refers to the direct use of computers by 
researchers. 23 The general trend toward the
increased use of computers is understand- 
able. Computers continue to become bet- 
ter, cheaper, more accessible, and more 
usable. Software continues to become more 
application-oriented, and user interfaces 
continue to improve. Databases continue to 
become larger and more relevant. As the 
use of computers becomes more common, 
users continue to increase in number and
sophistication, generating greater and greater 
demand for computation while driving prices 
even lower by expanding the size of the 
market. But the increasingly direct use of 
computers by their end-users is a more re- 
cent and more interesting trend, and its im- 
plications for research are profound. 

The term end-user refers to someone who 
physically uses a computer— the person who 
touches the keyboard and reads the screen. 



23 For most users, the trend toward direct access
began with personal computers (PCs), but it actually
began soon after the advent of the modern computer.
The very first computers of the early 1950s were essentially
single-user machines and, since users had to
be very aware of their machines' foibles (and typically
had to be present while running their programs in
order to deal with problems), they necessarily became
intimate end-users. Later, more reliable mainframe
computers often ran jobs in batch mode (batches of
work were run together instead of individually) to improve
their utilization, which tended to distance users
from their machines. In the early 1960s, however,
timesharing reintroduced direct access by allowing
multiple users to share a mainframe machine remotely
from their terminals.



The end-user may or may not initiate or 
consume the results of the computation. It 
is useful to distinguish the end-user from 
the "ultimate user" of a computer: some- 
one who initiates and consumes the results 
of a computation, without necessarily 
touching or seeing the machine. The ulti- 
mate user is the person who causes a com- 
putation to be performed and who uses the 
results of the computation, i.e., the person 
whose work involves computation, whether 
or not it involves using a computer directly. 

End-user computing occurs when the end- 
user and the ultimate user are the same. 
The crux of end-user computing is that the 
end-user is able to initiate computations and 
get results without going through an inter- 
mediary. To some extent, this is a detail: 
What difference does it make if a compu- 
tation is performed by a researcher or a 
programmer? But the distinction is an im- 
portant one, since it bears on how central 
the computation is to the researcher's thought 
process. If a researcher is the ultimate user 
of a database, for example, but is not the 
end-user, then some intermediary (librar- 
ian, data archivist, programmer, secretary, 
or assistant) is interposed between the re- 
searcher and the database, limiting the re- 
searcher's ability to interact directly with 
the data, to browse through it, to explore 
its idiosyncrasies, and to become intimate 
with it. Similarly, if a researcher asks 
someone else to write a program to com- 
pute summary statistics, the researcher will 
be unaware of the decisions embedded in 
that program or the problems encountered 
in writing it. 24 This kind of insulation from 
the computational process may free the re- 
searcher from menial tasks, but it also lim- 
its his or her ability to define the computation 



24 Although writing a program does not guarantee
that one will become, let alone remain, aware of its
implications and limitations, using a program written
by someone else virtually guarantees that the user will
not be aware of them.
correctly, use it appropriately, and understand
the implications of its results.

From a practical standpoint, end-user 
computing is attractive because of its con- 
venience. An end-user need not find a pro- 
grammer or data processing specialist (and 
an available machine) to get an answer to 
a computational problem. This reduces the 
threshold of effort required to perform 
computation, allowing users to consider it 
a more integral part of their work style. 

The ramifications of end-user computing 
in the research process are deeper and more 
subtle than they may first appear. Only by 
becoming intimate with the computational 
process can a researcher fully realize the 
potential of computation in performing re- 
search. Only when the researcher is an end- 
user does computing become familiar 
enough and convenient enough to be a nat- 
ural part of the research process. This is 
not an end in itself, but it is important be- 
cause it allows the researcher to conceive 
of new kinds of research that become pos- 
sible only when computation becomes an 
integral part of research. End-user com- 
puting is an important trend because the 
activity of computation allows researchers 
to reconceive the nature of research itself, 
i.e., the kinds of questions posed, the 
methodologies used, the type and extent of 
sources analyzed, and the form of presen- 
tation of the findings. (Examples are dis- 
cussed in a later section.) 

To summarize: End-user computing 
means direct access to computational ca- 
pability; the key implication of this in the 
current context is that it allows computa- 
tion to become an integral part of a re- 
searcher's thought process— and therefore 
of the research itself. 

Ubiquitous computing. One trend that 
is still relatively new is the advent of portable
computing, using laptop, notebook, or
even pocket-sized ("palmtop") computers. 
This portability means more than just being 
able to carry a computer from one location 
to another. It implies the ability to carry a 



part of one's working context (database, 
text, notes, and correspondence) in a ma- 
chine that can be used on location, in meet- 
ings, or while traveling. This context may 
be "downloaded" to a portable machine 
from a researcher's home machine and used 
for on-site research or during interactions 
with other researchers to modify data, re- 
cord notes, work on evolving documents, 
and many other tasks. The results of this 
work can then be "uploaded" to the researcher's
home machine, by a telecommunications
link from the remote location
or by a direct transfer of data after the re- 
searcher returns home. 

In addition to portable machines them- 
selves, cellular modems (modulator/de- 
modulators) allow computers to 
communicate over cellular telephone links. 
This allows the user to link computers while 
traveling anywhere that cellular telephone 
coverage is provided; it is already possible 
to connect to a remote computer or data- 
base from a portable computer while riding 
in a taxi in any major city in the United 
States. Whether this kind of remote com- 
puting will ultimately become a common 
activity depends on tradeoffs between the 
size, cost, and capacity of portable versus 
remote computers and the attendant tele- 
communications costs. 

The important point is not the size and 
capability of portable machines, but rather 
the freedom they give the user to perform
computations and to access data from any 
location. For example, another way of 
achieving the same result would be to pro- 
vide computer terminals in public places; 
this would be analogous to the use of standard
(noncellular) telephones, which are
ubiquitously available anywhere in the de- 
veloped world. The French government has 
implemented just such an approach to com- 
puting in its Minitel system, which is avail- 
able in homes and post offices throughout 
France. 25 Because of these alternatives, it 



25 David Margulius, "C'est la France, C'est Min-
is useful to think of this as a trend toward
"ubiquitous computing" rather than "port- 
able computing." This is discussed further 
under Connectivity below. 

End-user interfaces. The design of soft- 
ware for end-users has also had a tremen- 
dous impact on the growth of end-user 
computing. For end-users who are not 
computer specialists, "access" to compu- 
tation means more than simply having a 
computer or communicating with one. To 
use a computer effectively, such users need
software that allows them to work in ways 
that are natural to them, without having to 
learn the intricacies of an arcane computer 
system. Software for end-user computing 
must have two key attributes: It must provide
functionality that is of use to the end-user,
and it must present an interface that
is usable by an end-user. 

Appropriate functionality requires that 
software be either generically useful (such 
as word processors, electronic mail, databases,
spreadsheets, and mathematical programs)
or designed for some specific task
that the user performs. Task-specific programs
(or applications) tend to be written
for users in a given industry or type of 
work. 26 But if its interface makes it diffi- 
cult to use, neither generic nor task-specific 
software is of much value to any but the 
most dedicated and tenacious of end-users. 

The trend toward improving end-user in- 
terfaces began in the early 1960s. 27 Many 



itel," PC Computing 2 (January 1989): 194; Ellis
Booker, "Vive le Minitel," Telephony 215 (8 August
1988): 24; and S. Nora and A. Minc, The Computerization
of Society: A Report to the President of France
(Cambridge, Mass.: MIT Press, 1980).

26 Both general-purpose and task-specific programs
become more useful when they can be tailored to the
needs of a particular end-user. Examples of this are
word processors that allow users to define their own
document formats, function keys, "macros," etc. The
ultimate general-purpose program is a programming
system (or language) that allows end-users to define
new computations at will (i.e., to write programs);
end-users may become programmers to a limited extent
by tailoring software to their own needs.

27 For example, Cliff Shaw's JOSS system is widely



of the principles of current user interfaces 
were developed by Engelbart's group at the 
Stanford Research Institute (SRI) in the 
1960s and early 1970s. 28 This led to the 
development of a number of systems at Xe- 
rox Corporation's Palo Alto Research Cen- 
ter (PARC) in the late 1970s, culminating 
in the introduction of the Star in 1981. 29 
The Xerox Star pioneered the point-and-click,
window- and menu-based "desktop
metaphor" that is currently in vogue. This 
trend toward better user interfaces gained 
momentum with the development of per- 
sonal computers, and it has now reached a 
point where many systems can be learned 
and used effectively by most users without 
any formal computer training. Although the 
term user friendly has become such an ad- 
vertising cliche that it is now all but mean- 
ingless, its overuse is a measure of the extent 
to which the computer industry recognizes 
the importance of user interface design for 
end-user computing. 

The "online transition." One of the key 
factors that facilitates end-user computing 
is an "online transition" 30 in which com-



regarded as one of the earliest successful timeshared
systems designed for direct access by researchers. See
J. C. Shaw, JOSS: Conversations with the Johnniac
Open-Shop System (Santa Monica, Calif.: RAND
Corporation, P-3146, 1965); J. C. Shaw, "JOSS: A
Designer's View of an Experimental On-Line Computing
System," in American Federation of Information
Processing Societies Conference Proceedings
(Fall Joint Computer Conference), Vol. 26 (Baltimore,
Md.: Spartan Books, 1964): 455-64.

28 In addition to inventing the mouse, this visionary
group developed many of the concepts that form the
foundation of modern user interface design, as well
as producing one of the first hypertext systems. For
an early description of this work, see D. C. Engelbart
and W. K. English, "A Research Center for Augmenting
Human Intellect," American Federation of
Information Processing Societies Conference Proceedings
(Fall Joint Computer Conference) vol. 33
(May 1974), 395-410.

29 J. Johnson, T. L. Roberts, W. Verplank, D. C.
Smith, C. H. Irby, M. Beard, and K. Mackey, "The
Xerox Star: A Retrospective," IEEE Computer 22
(September 1989): 11-26.

30 The term online originated in the electric power
industry. Generating equipment is said to be "online" 
puting becomes more useful the more it is
used. If a user is still bound to the telephone,
paper mail, paper documents, paper
files, and paper memos, then computation 
remains an infrequently used tool that does 
not integrate with the rest of the work en- 
vironment. When electronic mail (e-mail) 
begins to replace telephone and paper messages
and when machine-readable electronic
documents and files begin to replace
paper, the user's working context is inte- 
grated in new ways. 

The online transition produces a new 
phenomenon: Many previously separate 
forms of communication become integrated 
by being stored in electronic form. For ex- 
ample, if telephone messages and tele- 
phone directories are both electronic, users 
can forward information from a phone mes- 
sage in e-mail and can use telephone num- 
bers or other information from a phone 
message to search their phone directories 
for information about callers. Many mes- 
sages that traditionally have come by tele- 
phone will in the future be sent by e-mail 
instead, since e-mail is asynchronous (the 
recipient does not have to be present to re- 
ceive an e-mail message) and provides a 
more legible and reliable medium for mes- 
sages containing text or data. Similarly, 
users can easily copy text from letters, 
memos, and informal messages into new 
documents and search their contents elec- 
tronically, rather than visually scanning vo- 
luminous printed material. 



when it is connected to a power distribution grid. The
term is used in information science to refer to information
and other resources being electronically accessible
to users by means of computers and
communication devices. Similarly, it refers to users
being able to access their work resources electronically,
i.e., having terminals, communication facilities,
computer accounts, etc., as needed to work in
this way. (Information that is not accessible in this
way, or users who do not have access to their work
in this way, are referred to as being "offline.") The
term online as used in the database and library domains
is derivative and analogous but considerably
narrower. It is used here in its more general sense.



In the early stages of the online transi- 
tion, computation does not fully realize its 
potential because it is not yet integrated into 
the user's work style. This creates a chicken- 
and-egg problem. Users are not motivated 
to use computation until its benefits out- 
weigh the cost of learning to use it (and 
changing one's work style to make use of 
it); but its benefits are realized only after
it becomes an integral part of one's work 
style. This problem produces a learning 
curve in which progress initially is slow, 
but it accelerates as the online transition 
proceeds. This curve rises steeply above a 
certain point, when a critical mass of the 
user's context becomes integrated online. 

Summary. The exact ways in which 
computation will be delivered to end-users 
in the future will be determined by factors 
that involve trade-offs among the costs of
computers, various kinds of memory and 
communication, and issues of privacy, 
convenience, and control. The form in which
computation is delivered will continue to 
evolve as the relative costs and benefits of 
various alternatives change. Ultimately, the 
end-user may not even know— and should 
not care— whether the response to a request 
is generated locally by the machine sitting 
on the user's desk, remotely by a special- 
purpose processor, or by some combination 
of the two. The importance of the trend 
toward end-user computing for researchers 
lies not in the details of its implementation 
but rather in its potential to transform 
scholarly communication by making com- 
putation an integral part of the researcher's 
thought process and work style. 

Connectivity 

The trend toward end-user computing is 
intimately related to the equally important 
trend of connectivity. This term describes 
the researcher's ability to access data, 
processing capabilities, and other research- 
ers electronically in ways that facilitate the 
research process. Connectivity is a broader 
concept than communication. Like communication,
connectivity includes the ability
of computers to talk to each other and
to access remote databases, but it also in- 
cludes the ability of researchers to work 
together in useful ways, to solicit feedback 
from each other, to disseminate their ideas 
and results, and to integrate their research 
sources and products. Connectivity re- 
quires communication, but it further as- 
sumes that information is in a usable form 
that facilitates interchange and integration. 

Many aspects of end-user computing rely 
on connectivity. The online transition re- 
quires that a sufficient critical mass of the 
user's context be available online. That is, 
the various categories of data that comprise 
this context (such as telephone messages, 
e-mail, memos, and documents) must all 
be accessible electronically and must be 
stored in a common, interchangeable form, 
so that data can be shared and exchanged 
among these different categories. Conven- 
tional wisdom recognizes that a critical mass 
of users must be online before they will 
truly benefit from their connectivity, but it 
is at least as important that a critical mass 
of information and tools be online if users 
are to reap the benefits of connectivity. 
Furthermore, convenient and effective in- 
terchange must be available across this crit- 
ical mass of information and tools before a 
user can profitably make the online tran- 
sition. 

Access to databases also requires con- 
nectivity, especially if the user needs to see 
the most up-to-date version of dynamic data. 
Access to dynamic data is particularly im- 
portant for research, where the most recent 
additions to a database (representing new 
publications, ideas, data, or research) are 
often the most valuable, even though they 
may change only a small fraction of the 
overall database. If a database is static (i.e., 
does not change very often), it can be copied
onto local systems, either by physically
sending disks to different sites or by down- 
loading data over a network (which again 



requires connectivity). However, if a re- 
mote database is dynamic, a user can see 
the most up-to-date version of the data only 
by either viewing the updated database over 
a network (relying on connectivity) or by 
updating a local copy of the database on 
demand (again, over a network) and view- 
ing the copy. Access to dynamic data there- 
fore depends on connectivity. 
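
The second option described above, updating a local copy on demand, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any actual system: the classes `RemoteDatabase` and `LocalCopy` are hypothetical, a simple version counter stands in for whatever change-tracking a real database would use, and the network transfer is simulated by an ordinary method call.

```python
class RemoteDatabase:
    """Hypothetical stand-in for a dynamic database held at a remote site."""

    def __init__(self):
        self.version = 0      # incremented on every change
        self.records = {}

    def add(self, key, value):
        self.records[key] = value
        self.version += 1

    def snapshot(self):
        # In practice this would be a transfer over a network.
        return self.version, dict(self.records)


class LocalCopy:
    """Hypothetical local copy that can be refreshed on demand."""

    def __init__(self, remote):
        self.remote = remote
        self.version, self.records = remote.snapshot()

    def is_stale(self):
        # Checking the remote version itself requires connectivity.
        return self.version != self.remote.version

    def lookup(self, key, refresh=True):
        # With refresh=False the copy behaves like a disk mailed once and never updated.
        if refresh and self.is_stale():
            self.version, self.records = self.remote.snapshot()
        return self.records.get(key)


if __name__ == "__main__":
    remote = RemoteDatabase()
    remote.add("1990-article", "older publication")
    local = LocalCopy(remote)
    remote.add("1992-article", "most recent addition")

    print(local.lookup("1992-article", refresh=False))  # None: the static copy misses it
    print(local.lookup("1992-article"))                 # most recent addition
```

The static copy still answers queries about older records, but the most recent addition, often the most valuable one for research, is visible only after the copy is refreshed over the network.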

An infrastructure of connectivity allows 
computation to be performed and data to 
be stored wherever it is most cost-effective, 
given that the relative costs of memory, 
computation, and communication are con- 
tinually changing. Connectivity allows 
computation and data to be reallocated from 
local to remote resources (computers, disks, 
etc.) as these costs change. This realloca- 
tion has traditionally required physical 
changes to system configurations (such as 
moving disk drives or rewiring buildings
with cables), but in principle this can be 
done without physical intervention, re- 
sponding automatically to changing costs 
or shifting demands. Connectivity there- 
fore facilitates end-user computing by al- 
lowing it to take advantage of evolving cost 
factors. 

The trend toward ubiquitous comput- 
ing—whether provided by portable com- 
puters, publicly available terminals, or other 
alternatives— relies on a similar form of 
connectivity to link users to their working 
"office" contexts by remote or portable
access. Ultimately, it will become irrele- 
vant whether a user's working context ex- 
ists in a single place or is distributed over 
a number of sites and machines. Connectivity
will allow users to access their computational
and informational contexts
wherever and whenever they need them. 

Access to computational and human 
resources. Although access to data and
one's working context is the most obvious 
aspect of connectivity, it has other impli- 
cations as well. In general, connectivity al- 
lows users to access resources. These may 
be data resources, but they may also be 
specialized computational or human resources.
Two related initiatives intended to
encourage such interactions by providing 
widely available, high-capacity networking 
are the National Research and Education 
Network (NREN) and the High Perform- 
ance Computing and Communications 
(HPCC) efforts. The capacity of a network 
is measured by its bandwidth, which is the 
number of bits of information it can trans- 
mit per second. 31 The NREN and HPCC 
efforts are targeted to produce gigabit (bil- 
lion-bit per second) transmission capacities 
during the next decade. 32 In addition to 
providing high-capacity "backbone" com- 
munications, related initiatives include ef- 
forts aimed at integrating the communication 
of text, images, voice, video, and other 
media. The NREN is intended to support 
the transmission of other media as well as 
text, although it should be noted that non- 
textual media require much greater trans- 
mission capacity. When fully implemented, 
NREN should greatly facilitate collabora- 
tion and resource sharing among research- 
ers. 
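
To give a sense of scale, the following Python sketch works through the arithmetic implied by a gigabit network, using the figure of roughly 20,000 bits for an uncompressed page of text given in the accompanying note. It is an illustration only: the 300-page document and the 2,400 bit-per-second modem rate are assumptions chosen for contrast, the function names are hypothetical, and protocol overhead and compression are ignored.

```python
BITS_PER_PAGE = 20_000               # approximate size of an uncompressed page of text
GIGABIT_PER_SECOND = 1_000_000_000   # target capacity: one billion bits per second
MODEM_BPS = 2_400                    # assumed dial-up modem rate, for contrast only


def pages_per_second(bandwidth_bps: float, bits_per_page: int = BITS_PER_PAGE) -> float:
    """Number of text pages a link of the given bandwidth can carry each second."""
    return bandwidth_bps / bits_per_page


def transfer_seconds(pages: int, bandwidth_bps: float,
                     bits_per_page: int = BITS_PER_PAGE) -> float:
    """Seconds needed to send `pages` pages of text, ignoring protocol overhead."""
    return pages * bits_per_page / bandwidth_bps


if __name__ == "__main__":
    print(pages_per_second(GIGABIT_PER_SECOND))       # 50000.0 pages per second
    print(transfer_seconds(300, GIGABIT_PER_SECOND))  # 0.006 seconds for 300 pages
    print(transfer_seconds(300, MODEM_BPS))           # 2500.0 seconds at modem speed
```

On these assumptions a gigabit link carries about 50,000 pages of text per second, which leaves ample headroom for the nontextual media that require much greater transmission capacity.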

Efforts such as NREN also are important 
because, despite the evolution toward 
cheaper computers, there may always be 
state-of-the-art computing facilities that re- 
main too costly for individual researchers 
to own. For example, large parallel com- 
puters may allow searching through huge 
databases for complex patterns, but the most 
powerful of such machines may always be 
too expensive for any one researcher or even 
any one research facility to justify their 
purchase. Connectivity will allow research- 
ers to share such facilities through remote 
access. 

Beyond access to machines, connectivity
allows researchers to communicate and collaborate
with each other and with special-



31 An average page of text consists of approximately
20,000 bits, although this volume can be reduced 
(compressed) for transmittal. 

32 Grand Challenges, 17-19, 54.



ists in other fields. The vast web of 
interconnected networks (sometimes re- 
ferred to informally as "WorldNet") al- 
ready allows researchers to broadcast or 
direct queries and requests by e-mail to a 
large proportion of the researchers in a given 
field, regardless of their nationality or lo- 
cation. This process is not always directly 
controlled by the initiator of a request: 
Queries may be forwarded by their initial 
recipients across networks and gateways 
between networks to individuals, electronic 
mailing lists, and electronic bulletin 
boards, 33 eliciting responses from distant 
and unlikely places. Integrated networking 
is greatly facilitated by an open systems 
approach, allowing multivendor software 
and hardware to communicate using stan- 
dard protocols. The International Standards 
Organization's Open Systems Interconnection
(OSI) reference model serves as a standard
for interconnection of this kind. 34 These
developments are producing a truly global 
communication capability, which is ex- 
panding rapidly and spontaneously. 

The communication aspect of connectiv- 
ity goes beyond the use of e-mail for asking 
questions or broadcasting general infor- 
mation. It is causing a major shift in the 
way many researchers collaborate and in- 
teract. 35 The use of e-mail allows arbitrary 



33 Electronic bulletin boards are analogues of their
physical counterparts. They allow online users to remotely
view notices posted electronically by other users.

34 The OSI reference model is discussed in detail in
A. S. Tanenbaum, Computer Networks, 2d ed. (Englewood
Cliffs, N.J.: Prentice-Hall, 1988), 14-34.

35 We are unaware of any research on e-mail use
among scholars, but for recent studies on the use of
e-mail and other collaborative electronic media in international
organizations, see T. K. Bikson and S. A.
Law, Electronic Mail Use at the Bank: A Survey and
Recommendations (Washington, D.C.: Information,
Technology, and Facilities Department, World Bank,
September 1991); and Tora K. Bikson and Sally Ann
Law, "Electronic Information Media and Records
Management Methods: A Survey of Practices in United
Nations Organizations," ACCIS Electronic Information
Media and Records Management Survey Report,
A RAND Note (N-345J-RC) (Santa Monica, Calif.:
RAND Corporation, 1991).
text and data files to be transmitted in simple,
linear text formats, without concern
for machine compatibility or knowledge of 
remote file systems. Researchers can gen- 
erally transform any relevant information 
into text and send it as the body of a mes- 
sage. Transforming formatted information 
(such as structured documents or page lay- 
outs) into linear text so that it can be ex- 
changed in this way requires that the sender 
and recipient have software capable of per- 
forming the appropriate transformations. 
Standards for transforming such informa- 
tion into linear text are evolving in re- 
sponse to this need. For example, the 
Standard Generalized Markup Language 
offers a standard textual representation for 
structured documents, whereas Post- 
Script 36 offers a widely used de facto stan- 
dard textual representation for formatted 
page images. Such standards already allow 
users to send textually encoded documents,
pictures, or formatted page layouts by e-mail
instead of on paper. The e-mail recipient
can view or print the transmitted information
after transforming it back to its
original form. This capability will continue 
to improve as standards for graphics and 
other media evolve. 
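
The round trip described here, from arbitrary data to linear text and back to the original form, can be illustrated with a short Python sketch. It uses Base64, a general-purpose text encoding available in the Python standard library; this is an assumption made for illustration and is not one of the standards named above (SGML and PostScript represent document structure and page images, whereas Base64 merely re-expresses arbitrary bytes as plain characters). The function names are hypothetical.

```python
import base64


def to_mail_text(data: bytes) -> str:
    """Encode arbitrary bytes as short lines of plain ASCII suitable for a message body."""
    return base64.encodebytes(data).decode("ascii")


def from_mail_text(text: str) -> bytes:
    """Recover the original bytes from the text produced by to_mail_text."""
    return base64.decodebytes(text.encode("ascii"))


if __name__ == "__main__":
    original = bytes(range(256))           # stands in for any formatted file
    body = to_mail_text(original)          # what the sender places in the message
    restored = from_mail_text(body)        # what the recipient's software rebuilds
    print(restored == original)            # True
```

As the text notes, the exchange works only if sender and recipient both have software that performs the matching transformation; here that is the pair of functions above.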

Connectivity also promises to "erase the
geography" that separates students from 
teachers, classes, or other resources of in- 
terest. The educational notion of "distance 
learning" has evolved from the correspon- 
dence course to the use of televised instruc- 
tion, but networking allows a much richer 
form of educational interaction. Particu- 
larly in upper-level scholarly subjects, it is 
now possible to envision geographically 
distributed seminars that bring together in- 
terested scholars and students without re- 
gard to their physical locations. 

The use of e-mail, teleconferencing, and 



36 Adobe Systems, Inc., PostScript Language Reference
Manual (Reading, Mass.: Addison-Wesley,
1990).



remote windowing is producing a new phe- 
nomenon: computer-supported cooperative 
work (CSCW). 37 Through CSCW, groups 
of researchers can work together, sharing 
their context and coordinating their work, 
regardless of their locations, schedules, and 
work styles. Connectivity allows coopera- 
tion in all phases of research, including 
concept formation, literature and back- 
ground search, analysis, publication, peer 
review, and dissemination. This trend has 
the potential to both reduce the time re- 
quired to perform and publish research and 
improve its quality through earlier and wider 
review. CSCW also facilitates interdisci- 
plinary research through online discussion 
forums that are open to all interested par- 
ties, not just credentialed members of a 
particular discipline. This openness makes 
it easier for researchers from different fields 
and institutions to collaborate, which may
broaden the perspective of scholarly com- 
munication. Finally, the trend toward shar- 
ing the research process may well change 
the conception of the research product itself 
into something more multidimensional than 
a traditional document, allowing it to re- 
flect multiple views and opinions. (See the 
section on hypertext and hypermedia later 
in this paper.) Note that the implications 
explored here are not derived from tech- 
nological determinism: The technology it- 
self does not produce such changes. Rather, 
the changes result from the trend toward 
sharing and collaborating, which the tech- 
nology facilitates. 

The trend toward interchange stan- 
dards. True connectivity involves the abil- 
ity to interchange information, which 
requires that information be represented in 
a standard form. The relative youth of in- 
formation science as a field and the rapid 
evolution of computers and communication



37 For an excellent annotated bibliography of current
work in CSCW, see Saul Greenberg, "An Annotated
Bibliography of Computer Supported Cooperative
Work," SIGCHI Bulletin 23 (July 1991): 29-62.

technology have produced chaotic alternatives
for representing and communicating
information. This may be unavoidable in a 
field in which technology and paradigms 
are still evolving. By their very nature, novel 
ideas do not always fit into previous pat- 
terns. Similarly, new computational capa- 
bilities often produce new information 
structures that do not easily translate into 
existing standard forms. Furthermore, the 
development of new standards is a slow 
process because it requires compromise and 
consensus. The development of standards 
is therefore a difficult undertaking, and they 
tend to lag behind the latest technological 
advances. Nevertheless, the growing em- 
phasis on interchange standards is a vital 
and worthy trend, without which the prom- 
ise of connectivity cannot be realized. 

Standards are beginning to evolve for text 
(as discussed in the section on Computer- 
Assisted Analysis Achieved Through Con- 
version), and ultimately they will extend to 
graphics, voice, three-dimensional model- 
ing, animation, video, and other media as 
well. In the early stages of this process, the 
goal is to develop usable initial standards 
quickly, without precluding their extension 
and modification in the future. This trend 
toward extensible standards is motivated by 
a recognition of the inevitable lag between 
standards and technological advance. De- 
veloping such extensible standards is a ma- 
jor technical challenge, involving a 
significant effort to translate among differ- 
ent standards and different versions of 
evolving standards. Ideally, such transla- 
tion will minimize the need for the user to 
be aware of the underlying standards, and 
inexpensive computation will provide 
transparent translation among standards 
without user intervention. 

In addition to interchange standards, a 
trend is developing toward defining stan- 
dards and policies for privacy and author- 
ization of access. As collaboration becomes 
more common, it will become increasingly 
important for researchers to be able to pro- 



tect their data, analysis, and results. Pla- 
giarism, theft, tampering, and sabotage will 
undermine the advances of connectivity if 
technical, administrative, and legal solu- 
tions to these problems are not imple- 
mented. Even the computation and 
collaboration processes themselves must be 
protected from unauthorized auditing and 
analysis. Various agencies or individuals 
could easily misuse or abuse knowledge of 
the kinds of questions a researcher asks and 
the thought processes involved in formu- 
lating research. The trend toward increas- 
ing interest in privacy and security issues 
is evidenced in a number of recent confer- 
ences and publications. 38 

A false dichotomy: distributed versus 
centralized control. One of the most in- 
triguing implications of the trend toward 
connectivity is its potential to redefine the 
meaning of control over intellectual arti- 
facts. In particular, the traditional dichot- 
omy between distributed and centralized 
control may no longer be appropriate. This 
dichotomy is based on the natural but out- 
dated assumption that control is a function 
of location in the physical world. Tradi- 
tionally, a resource has been considered to 
be under centralized control if it exists in 
only one physical location and is main- 
tained by agents residing at that location. 
Conversely, a resource is considered to be 
under distributed (decentralized) control if 
it consists of multiple copies or parts that 
are dispersed among multiple locations and 



38 Computers, Freedom and Privacy Conference,
sponsored by Computer Professionals for Social Responsibility,
San Francisco Marriott, Burlingame, Calif.,
25-28 March 1991; The National Conference on 
Computing and Values (NCCV), held at Research 
Center on Computing and Society, Southern Con- 
necticut State University, New Haven, Conn. 12-16 
August 1991; and the seventh Annual Computer Se- 
curity Applications Conference, sponsored by Aero- 
space Computer Security Associates and American 
Society for Industrial Security, and the Association 
for Computing Machinery and the Institute of Electrical
and Electronics Engineers, St. Anthony's Hotel, San
Antonio, Tex., 2-6 December 1991.

maintained by agents dispersed among those
locations. This dichotomy applies reason- 
ably well to physical resources, but it fails 
to work for resources created by electronic 
connectivity. 

The physical location of a resource has 
little meaning in the electronic domain. 
Connectivity allows resources to be repli- 
cated and distributed among numerous 
physical locations while behaving as though 
they existed in only one location (and vice 
versa). The key to this phenomenon is the 
separation between an electronic resource's 
physical location and its availability: A da- 
tabase may reside on a storage device in 
one location while being viewed or modi- 
fied via a terminal in another location. Sim- 
ilarly, a database that appears to exist in 
only one location may actually consist of 
pieces distributed and replicated among nu- 
merous locations and may be viewed or 
modified by numerous agents via com- 
puters at different locations. This charac- 
teristic is the definition of connectivity: 
Access becomes independent of location.
The notions of centralized and decentral- 
ized (distributed) control simply do not ap- 
ply in this context. New forms of control — 
and policies for when to employ them— are 
likely to evolve as connectivity replaces 
physical access to resources. 
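
The separation between where a resource is stored and where it is used
can be illustrated with a toy sketch. The class below keeps a full copy
of a database at each of several sites but presents a single logical
view to the user. The class name, the site names, and the
write-to-every-copy policy are all assumptions made for illustration;
real replicated systems use far more elaborate protocols.

```python
class ReplicatedStore:
    """One logical database held as full copies at several sites."""

    def __init__(self, sites):
        # Each site holds its own copy of the data.
        self.replicas = {site: {} for site in sites}

    def write(self, key, value):
        # Simplifying assumption: every write reaches every copy at once.
        for copy in self.replicas.values():
            copy[key] = value

    def read(self, key, site=None):
        # Any copy can answer; the caller need not know which one did.
        if site not in self.replicas:
            site = next(iter(self.replicas))
        return self.replicas[site][key]


store = ReplicatedStore(["site-a", "site-b"])  # hypothetical site names
store.write("finding-aid-12", "revision 1")
same_everywhere = (
    store.read("finding-aid-12", "site-a")
    == store.read("finding-aid-12", "site-b")
)
```

From the user's point of view there is one database; whether it lives
in one place or many is invisible, which is the sense in which access
becomes independent of location.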

Summary. End-user computing and 
connectivity have been discussed sepa- 
rately here for expository reasons, but their 
full impact lies in their mutual synergy. 
Connectivity elevates end-user computing 
above simple word processing or calcula- 
tion by allowing end-users to access remote 
databases, share information in many dif- 
ferent media and forms, connect to their 
working contexts wherever they are, com- 
municate with their peers, and collaborate 
in all phases of research. End-user com- 
puting in turn provides one of the main mo- 
tivations for improving connectivity: 
Networks do not connect machines, they 
connect people. The combined trends of end- 
user computing and increasing connectivity 



will shape the evolution of research (along 
with many other endeavors) well into the 
next century. 

Specific Technology Trends Affecting 
Scholarly Communication 

The major trends of end-user computing 
and connectivity will manifest themselves 
in many ways. This section identifies a 
number of specific technology trends that 
will superimpose themselves over this 
background. Each subsection discusses an 
area of technology that is expected to have 
a particular impact on research. Although 
not exhaustive, this examination includes 
some of the technologies that are likely to
have the greatest influence over the next 
decade, i.e., artificial intelligence, end-user 
publication and distribution, hypermedia, 
visualization, and virtual reality. 

Artificial intelligence. Current trends in 
artificial intelligence (AI) have the potential
to affect scholarly research in a number
of ways. AI may provide intelligent aids 
for analyzing and interpreting sources; au- 
tomated "agents" that can help researchers 
stay abreast of new findings; and tools to 
help formulate research concepts. AI may
also enable researchers to model their subject
areas to test hypotheses. Finally, AI
has the capacity to produce intelligent tu- 
tors that may help researchers leverage their 
teaching skills. 

The recent commercial success of expert 
systems (and more generally, knowledge- 
based systems) has brought AI out of the 
ivory tower where it had evolved since the 
early days of computing. A number of gen- 
eral-purpose programming languages and 
environments (expert system shells) for 
building expert systems have appeared on 
the market, allowing users with little or no 
formal training in AI to take advantage of
some of the most common AI techniques. 
Yet AI encompasses much more than just 
expert or knowledge-based systems. As one 
of the frontiers of computing, it attempts
to find ways of using computers to solve
problems they cannot now solve. AI is 
driven by dual motivations that sometimes 
conflict with and sometimes enhance each 
other. The first of these, which can be 
thought of as a "modeling" motivation, 
seeks to use computers to model and un- 
derstand intelligence. The second, which 
can be thought of as an "engineering" mo- 
tivation, simply seeks to solve difficult 
problems, by whatever means. AI efforts 
that are motivated by modeling tend to fo- 
cus on defining intelligence, understanding 
cognitive processes, and addressing prob- 
lems whose solutions are acknowledged to 
require intelligence. AI efforts motivated 
by engineering simply try to solve difficult, 
worthwhile problems, using any available 
techniques, regardless of whether the tech- 
niques simulate human intelligence. 

Because of these dual motivations and 
because AI is a frontier (and therefore nec- 
essarily dynamic and evolving), it tends to 
include many disparate activities and technologies,
ranging from the automation of
formal mathematical logic to the design of 
artificial neural networks. Several themes 
run through AI, such as representing 
knowledge, language, and meaning and 
finding relevant patterns or solutions among 
large, complex sets of alternatives. The primary
influences of AI on scholarly communication
are likely to be its ability to
analyze linguistic and pictorial informa- 
tion, its ability to find patterns, its ability 
to create automated "agents" that act on a 
user's behalf, and its ability to model reality
and formulate concepts.

The bulk of scholarly data is currently in 
textual form, and text will undoubtedly 
continue to be the major target of scholarly 
research for some time. Other forms of data, 
such as visual imagery (including draw- 
ings, paintings, photographs of sites or ar- 
tifacts, holograms, and film and video), 
spoken language, sounds, and music may, 
however, play greater roles as the technol- 
ogy for their encoding and analysis im- 



proves. AI software's growing ability to 
understand the semantics (and eventually 
the pragmatics) of language and to analyze 
relationships and identify patterns will make 
it an increasingly attractive tool for per- 
forming scholarly analysis. In addition, AI 
has developed a number of techniques for 
dealing with beliefs and uncertain, contra- 
dictory, or hypothetical information, which 
may help researchers who must often gen- 
erate hypotheses and rely on contradictory 
or uncertain conclusions and beliefs in or- 
der to find patterns and relationships. Cou- 
pled with growing databases of encoded text 
and fast processing, these techniques will 
enable researchers to look for new, unex- 
pected patterns across a wide range of sub- 
ject areas. Similar capabilities eventually 
will extend to visual imagery and sound, 
allowing integrated analyses of text, speech, 
music, and pictorial data. Although it will 
probably be some time before AI will be 
capable of truly understanding literary 
text 39 — and even longer before it will be 
capable of understanding spoken language 
or visual imagery— it is already capable of 
filtering large bodies of text to find literary 
aspects or relationships that are of partic- 
ular interest to a researcher. In this role, 
AI will not replace the analytic insight of 
the researcher, but it will enhance the re- 
searcher's ability to scan large collections 
of information and find patterns worthy of 
analysis. 

One of the major emphases of AI re- 
search has been to develop intelligent agents 
that can behave autonomously on behalf of 
their users. Robots (which are still largely 
experimental) are the most dramatic ex- 
amples of such agents, but another class of 
agents is more relevant to scholarly re- 
search. These are informational agents, such 
as literature-search or SDI (selective dis- 



39 See Nancy M. Ide and Jean Véronis, "Artificial
Intelligence and the Study of Literary Narrative,"
Poetics 19 (1990): 37-63.

semination of information) agents, which
can search for information of interest to a 
researcher, using criteria specified in a form 
similar to a database query. Such agents 
ultimately may perform a number of serv- 
ices, such as translating a researcher's query 
into the form required by particular data- 
bases; periodically repeating a query or 
search; monitoring activity on a network or 
in a database and alerting the user when 
"interesting" events occur; soliciting, col- 
lecting, and filtering information from many 
sources; responding to routine requests from 
other researchers for information or to other 
correspondence; and coordinating the 
schedules and activities of a collection of 
researchers engaged in collaborative effort. 
Such agents will eventually take over many 
of the traditional activities of a secretary: 
They will make up for their relative lack 
of initiative and creativity by being tireless, 
dedicated, and inexpensive. 40 
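
One of the services listed above, periodically repeating a query and
alerting the user only to new matches, can be sketched in a few lines of
Python. This is a deliberately simple illustration, not a description of
any actual agent system: the class name, the keyword-matching rule, and
the sample catalog records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StandingQuery:
    """A stored query that reports each matching record only once."""

    keywords: set
    seen: set = field(default_factory=set)

    def check(self, catalog):
        """Return ids of unreported records whose title has every keyword."""
        hits = []
        for record_id, title in catalog.items():
            words = set(title.lower().split())
            if record_id not in self.seen and self.keywords <= words:
                hits.append(record_id)
                self.seen.add(record_id)
        return hits


# Hypothetical catalog records, keyed by record id.
catalog = {
    1: "Electronic records in archives",
    2: "Medieval trade routes",
}
agent = StandingQuery({"electronic", "records"})
first_run = agent.check(catalog)

catalog[3] = "Appraisal of electronic records"
second_run = agent.check(catalog)
```

The first run reports the one matching record already in the catalog;
the second run reports only the record added since, which is the
behavior that spares the researcher from rereading old results.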

In addition to its role in the analytic phase 
of research, AI may have an impact on the 
concept formation that leads to research. In 
this earliest conceptualization phase, re- 
searchers often generate informal hy- 
potheses about a subject area, in an attempt 
to define interesting research thrusts. A 
number of tools currently emerging from 
"knowledge acquisition" efforts in AI have 
the potential to help identify viable hy- 
potheses and useful concepts. These con- 
cept-formation tools help the user form 
concepts by asking questions that can dis- 
criminate between examples and counter- 
examples of an evolving concept, based on 
attributes that the user declares as defining 
the concept. For example, a researcher might 
attempt to define a concept such as "ado- 
lescent imagery" in a body of text in terms



40 For research on intelligent agents, see Robert E.
Kahn and Vinton G. Cerf, An Open Architecture for
a Digital Library System and a Plan for Its Development.
The Digital Library Project, Volume 1: The
World of Knowbots (Washington, D.C.: Corporation
for National Research Initiatives, March 1988).



of attributes such as age, immaturity, and 
sexual embarrassment. A concept forma- 
tion tool might attempt to find examples of 
such images, asking the user to rate each 
candidate passage according to each attrib- 
ute. Based on these ratings, the tool might 
then show which of these passages appear 
to be examples of the concept and which 
ones appear to be counterexamples, thereby 
helping the user form a consistent and use- 
ful definition of the desired concept. 
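
The rating step in this example can be sketched as follows. The sketch
assumes, purely for illustration, that the user rates each passage from
0 to 1 on each attribute and that a passage counts as an example of the
concept when its mean rating reaches a threshold of 0.5. The attribute
labels, the passages, the ratings, and the threshold rule are all
hypothetical; an actual concept-formation tool would use a more
sophisticated method.

```python
# Attribute labels the user has declared as defining the concept.
ATTRIBUTES = ("age", "immaturity", "embarrassment")


def classify(ratings, threshold=0.5):
    """Split rated passages into examples and counterexamples."""
    examples, counterexamples = [], []
    for passage, scores in ratings.items():
        mean = sum(scores[name] for name in ATTRIBUTES) / len(ATTRIBUTES)
        if mean >= threshold:
            examples.append(passage)
        else:
            counterexamples.append(passage)
    return examples, counterexamples


# Hypothetical user ratings for two candidate passages.
ratings = {
    "passage A": {"age": 0.9, "immaturity": 0.8, "embarrassment": 0.7},
    "passage B": {"age": 0.2, "immaturity": 0.1, "embarrassment": 0.0},
}
examples, counterexamples = classify(ratings)
```

Seeing which passages fall on each side of the line lets the user judge
whether the declared attributes really capture the intended concept and
revise them if they do not.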

Much of AI research focuses on model- 
ing. In order to act intelligently or solve 
complex problems, AI systems often create 
models of reality about which they can rea- 
son or which they can manipulate in order 
to decide how to act in the real world. Tra- 
ditional simulation and mathematical mod- 
eling techniques are severely limited in the 
types of questions they can answer. Simulation
users, for example, typically specify
the initial state of a simulated world and
then run the simulation to see what hap- 
pens. This "toy duck" view of modeling 
("wind it up and see where it goes") cor- 
responds to asking questions of the form 
"what if . . . ?" (i.e., what would happen 
if the world were to proceed from this given 
initial state?). This ability to ask "what if 
. . . ?" questions is often touted as the ul- 
timate analytic capability, but many other 
kinds of questions are at least as important 
in many situations. 41 These include such 
questions as: Why did some agent take a 
particular action? Why did a given event 
happen? Can a particular event ever hap- 
pen? Under what conditions will a given 
event happen? Which events might lead to 
a particular event? How can a desired result 
be achieved? Ongoing AI research in this 



41 M. Davis, S. Rosenschein, and N. Shapiro, Prospects
and Problems for a General Modeling Methodology
(Santa Monica, Calif.: The RAND
Corporation, N-1801-RC, June 1982); and J. Rothenberg,
"The Nature of Modeling," in Artificial Intelligence,
Simulation, and Modeling, edited by L.
Widman, K. Loparo, and N. Nielsen, 75-92 (New
York: John Wiley & Sons, August 1989).

area is producing powerful new techniques
for modeling intentions, causality, goals,
beliefs, and other phenomena to allow answering
questions that go beyond "what if
. . . ?" 42

This trend toward model-based systems 
will provide researchers with techniques for 
conducting experiments, evaluating hy- 
potheses, and exploring alternative inter- 
pretations of reality with minimal cost and 
risk (since they are carried out within a 
computer). As a simple example, sociolog- 
ical or cultural models could be built to 
explore alternative hypotheses about an an- 
cient civilization, using the model to make 
predictions that can be compared with historical
evidence. AI techniques such as these
may help researchers conceptualize re- 
search as well as perform analyses. 
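
The difference between a "what if" question and a question such as
"under what conditions will a given event happen?" can be illustrated
with a toy model. In the sketch below, a simple compound-growth model of
a population stands in for a sociological model; the model, the function
names, and the numbers are assumptions chosen only for illustration. The
second function answers its question by brute-force search over
candidate inputs, which is much cruder than the causal and goal-based
techniques discussed above.

```python
def simulate(population, growth_rate, years):
    """'What if' question: run the model forward from a given initial state."""
    for _ in range(years):
        population = population * (1 + growth_rate)
    return population


def conditions_for(target, population, years, candidate_rates):
    """'Under what conditions' question: which rates reach the target?"""
    return [
        rate
        for rate in candidate_rates
        if simulate(population, rate, years) >= target
    ]


# What if a settlement of 1,000 people grows 5 percent a year for 10 years?
outcome = simulate(1000, 0.05, 10)

# Under which of these growth rates would it at least double in 10 years?
doubling_rates = conditions_for(2000, 1000, 10, [0.01, 0.05, 0.10])
```

The first call only winds the model up and watches where it goes; the
second works backward from a desired event to the conditions that
produce it, which is closer to the kinds of questions a researcher
testing a hypothesis actually asks.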

The modeling capabilities of AI are also 
the key to its use in education. Intelligent 
tutors are an outgrowth of joint research in 
education and AI; typically, they involve a
model of the subject matter to be taught (a 
domain model) and a model of the student. 
The domain model elevates an intelligent 
tutor above the level of simple programmed 
instruction because it enables the tutor to 
answer unanticipated questions about the 
subject matter. Students can therefore ask 
a much wider range of questions and pur- 
sue many alternative paths of instruction. 
Similarly, the student model helps the tutor 
determine which concepts the student is 
having trouble understanding. This helps 
the tutor address the student's underlying 
problem rather than simply repeating new 
material or backing up blindly to review 
previous material. Although intelligent tutors
are still largely experimental, they appear
to hold great promise for improving



42 See J. Rothenberg, "Using Causality as the Basis
for Dynamic Models," in Proceedings of the Third 
International Working Conference on Dynamic Mo- 
delling of Information Systems (DYNMOD-3) (Delft, 
The Netherlands: Delft University of Technology, 
1992), 277-92. 



the educational process, particularly for 
students who are self-motivated and self- 
paced. Ultimately, this should allow schol- 
ars to leverage their teaching skills by de- 
veloping tutors that embody their expertise. 

In summary, current trends in artificial 
intelligence may affect scholarly research 
by 

• providing analysis aids that can help 
find and interpret relevant source data, 
text, and other media. 

• creating informational agents that can 
perform some of the routine tasks of 
keeping abreast of new findings, act- 
ing as tireless monitors of develop- 
ments in a field. 

• providing tools to help researchers ex- 
plore, formulate, and refine research 
concepts and hypotheses. 

• enabling researchers to model their 
subject areas to try out hypotheses and 
predict where to find confirming (or 
falsifying) evidence. 

• facilitating the development of intel- 
ligent tutors that can help researchers 
disseminate their knowledge and 
teaching skills to wider audiences. 

Since AI is one of the frontiers of infor- 
mation science, it is also not unlikely that 
additional developments in this field will 
have unforeseen consequences for the ev- 
olution of scholarly research. 

End-user publication and distribution.
An equally important though less exotic
computing trend is the growing ability
of end-users to publish and distribute their 
own work. This is already creating alter- 
natives to traditional publication in schol- 
arly journals, not only reducing the time it 
takes to publish research but, more impor- 
tantly, changing the channels of distribu- 
tion, redefining the review process, and 
transforming dissemination by means of
electronic connectivity. 

The most prosaic form of end-user pub- 
lication is the production of camera-ready 
printed documents, suitable for publication 
or reproduction and dissemination without
further typesetting or layout work (sometimes
referred to as "desktop publishing").
Even this simple modernization of the tra- 
ditional publication process has profound 
implications. As with all forms of end-user 
computing, end-user publication involves 
the author of a document much more di- 
rectly in its production. Because of this 
availability of layout and production tools 
during the draft phase, a document ap- 
proaches its final form at an earlier stage 
of development. For example, figures, 
footnotes, and final formatting can be in- 
corporated into early drafts, giving review- 
ers a more readable product and helping to 
eliminate errors and, in general, to improve 
the product. Ideally, the author's control 
over questions of typography, graphics, and 
layout means that the final document represents
a more accurate and integrated reflection
of the author's overall intent. The
corresponding disadvantage is that authors 
must learn new publication skills, for which 
they may have little inclination, patience, 
or talent. Of course, end-user publication 
does not preclude the use of secretaries, 
graphic artists, or publication specialists to 
reintroduce traditional expertise in the pub- 
lication process, but this intervention tends 
to subvert the advantages of end-user publication
by slowing the process and reducing
the author's control.

Beyond modernizing the traditional pub- 
lication process, end-user publication al- 
lows authors to publish their work 
electronically, bypassing the production and 
distribution of paper documents entirely. 
Electronic documents can easily reproduce 
most of the desirable attributes of paper, 
and they provide increased flexibility for 
correction, revision, access, and dissemi- 
nation. During the production phase of a 
document, these features facilitate remote 
collaboration and early review and they 
greatly simplify the revision process. End- 
user publication also facilitates a radically 
different view of the research process, in 
which ideas are disseminated for review and 



feedback in the earliest stages of research, 
i.e., prior to documenting or even performing
the research. (Examples of this are discussed
later in this paper.)

Electronic dissemination makes use of 
increasing connectivity to bypass tradi- 
tional distribution channels, reduce the cost 
of reproduction and mailing, and enable re- 
cipients of a document to redistribute it by 
forwarding it in electronic form. 43 The 
copyright and other legal implications of
electronic dissemination are only beginning
to be explored. Similarly, direct, online ac- 
cess to the source of a document makes it 
easier than ever to plagiarize ideas, text, 
and even complex graphics without leaving 
any trace. These problems must be ad- 
dressed by technical, legal, administrative, 
and, ultimately, cultural policies. Such pol- 
icies are likely to evolve more slowly than 
the technology they seek to civilize, leav- 
ing a gap between practice and policy for 
at least the next decade or two; this gap is 
part of the cost of the technological revo- 
lution of scholarly research. 

Hypertext and hypermedia. All re- 
search studies must explicitly or implicitly 
address a number of questions that rep- 
resent different dimensions of inquiry, such 
as What is the problem? What assumptions 
were made about the problem? What re- 
lated research exists? What is original about 
the study? What methodologies were con- 
sidered? What approach or method was 
chosen, and why? What sources and data 
were used? What analysis was performed? 
What were the results? How should the results
be interpreted? What other interpre-



43 Computers and networks are being used in the
commercial sector as well, both to help automate the
process of publishing traditional books and journals
and to develop novel electronic products. This electronic
publishing industry has so far had little impact
on end-user publication, but it may be too soon to tell
whether this industry will ultimately attempt (or manage)
to appropriate and commercialize the new channels
of distribution and dissemination that end-users
are currently developing for themselves.

tations are worth considering? How do the
results and interpretations depend on the 
researcher's assumptions? What are the im- 
plications of the research? It is difficult to 
answer all such questions without inundat- 
ing and confusing the reader. 

Similarly, presenting complex subject 
matter to students requires answering anal- 
ogous questions about context, background 
history, alternative approaches or formu- 
lations, and relationships to other disci- 
plines. Traditional textbooks and other 
instructional materials seldom address these 
issues adequately. 

Such questions are inherently interre- 
lated and multidimensional. Answering them 
in a strictly sequential, linear fashion is often 
constraining and unrevealing. Yet written 
documents necessarily present their argu- 
ments linearly. In addition, an expository 
sequence that provides insight to one reader 
or audience may not be enlightening to an- 
other. Cross-references, references to other 
documents, repetition, overviews, and 
summaries can ameliorate these problems, 
but only at the cost of redundancy and added 
work for the reader (flipping pages to find 
cross-references or consulting other documents).
Furthermore, documents, which are
inherently static, are hard-pressed to por- 
tray processes or other dynamic phenom- 
ena. The effectiveness of graphics is 
similarly limited by the static nature of the 
printed image. Oral presentations can be 
less linear than documents, can be tailored 
to specific audiences, and are better suited
to presenting dynamic phenomena, but they 
are ephemeral and cannot provide the depth 
of the printed word. 

Electronic information technology prom- 
ises to transcend these limitations by deliv- 
ering research results in an interactive, 
electronic form that is nonlinear and mul- 
tidimensional and that integrates written, 
spoken, and graphic media in a permanent, 
dynamic, customizable presentation. The 
terms hypertext and hypermedia suggest the 
novel characteristics of this new approach: 



1. It provides rich, dynamic linkages 
among the elements of a presenta- 
tion. For example, using electronic 
retrieval and display, a reference from 
one item of text to another (whether 
a cross-reference, a bibliographic en- 
try, or a citation in another work) can 
be viewed instantly in a window 
without the user's having to turn pages 
or find another document. Such links 
can be used to present different di- 
mensions of analysis, alternative se- 
quences of exposition, optional 
degrees of elaboration or depth, sup- 
porting evidence, references, data, or 
contextual background. The multidi- 
mensional nature of such structures is 
denoted by the prefix "hyper." Au- 
thors can use this linking to present 
different kinds of information or to 
define alternative paths that generate 
different presentations or variants from 
a single master document. 

2. Hypermedia combines several media 
that currently can be presented elec- 
tronically, such as text, color graph- 
ics, and sound (including voice). 
These can all be linked together as 
easily as text, producing presenta- 
tions that combine the features of 
documents and oral presentations. 

3. These media can be presented dynamically.
This allows animating
graphics, synchronizing voice with 
animation to describe processes, and 
controlling the pace of a presentation, 
as in an oral briefing. 

4. This approach is interactive, allowing 
the reader to control the sequence, 
speed, depth, and focus of the pre- 
sentation, within limits set by the au- 
thor. 
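
The linking and alternative paths described in item 1 can be sketched
with a very small data structure. In the Python fragment below, a single
master document is stored as named nodes, the author defines alternative
paths through them, and links let the reader jump from one node to
related ones. The node names, path names, and function names are
hypothetical and chosen only to echo the research questions listed
earlier; actual hypertext systems are considerably richer.

```python
# One master document, stored as named nodes rather than a fixed sequence.
nodes = {
    "problem": "What is the problem?",
    "method": "What approach or method was chosen, and why?",
    "data": "What sources and data were used?",
    "results": "What were the results?",
}

# Author-defined paths: different presentations of the same material.
paths = {
    "summary": ["problem", "results"],
    "full": ["problem", "method", "data", "results"],
}

# Links the reader may follow from a node to supporting material.
links = {"results": ["data", "method"]}


def present(path_name):
    """Generate one presentation by walking an author-defined path."""
    return [nodes[name] for name in paths[path_name]]


def follow(node_name):
    """List the nodes reachable from a given node by a direct link."""
    return links.get(node_name, [])


summary_view = present("summary")
from_results = follow("results")
```

A reader in a hurry sees only the two-node summary path, while a reader
who wants the supporting evidence can follow the links from the results
to the data and method without leaving the document.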

The concept of a nonlinear document 44



44 Although hypertext and hypermedia products are
very different from traditional documents, they are
generally referred to as "documents" for want of a
better word.

can be traced back at least as far as the
seminal paper "As We May Think," by 
Vannevar Bush, in 1945. 45 The electronic 
implementation of this concept is begin- 
ning to transform the traditional notion of 
a document into a multimedia, nonlinear 
form of presentation. The publication of re- 
search results in hypermedia form may make 
them more accessible and more captivat- 
ing, thereby greatly increasing the impact 
and influence of research, particularly out- 
side the traditional scholarly community. 
The result may be greater public recogni- 
tion of policy issues identified by re- 
search—such as the need to preserve historic 
sites or artifacts— in much the same way 
that popular televised documentaries have 
increased public awareness of myriad sci- 
entific, cultural, and environmental issues. 
Furthermore, the use of hypermedia may 
transform the research process itself by 
providing a natural way to represent and 
keep track of interrelated facts, references, 
hypotheses, and arguments, as well as re- 
actions, revisions, and annotations to sup- 
port collaboration. Finally, hypermedia may 
transform educational material into a new, 
multidimensional experience that will cap- 
italize on the exploratory tendencies of 
scholarly students. 

Visualization and virtual reality. Re- 
cent trends in visualization and virtual real- 
ity have the potential to transform the way 
scholarly researchers interact with their data 
and perform their analyses. The world of 
scientific computing has begun to develop 
techniques that allow scientists to visualize 
the results of complex computations. 
Graphic techniques and animation are being 
used to display complex data in ways that 
attempt to make significant patterns leap 
out at the user. Abstract relationships are 
often easier to grasp if they are translated 
into graphical presentations, such as false-



45 Vannevar Bush, "As We May Think," Atlantic
Monthly 176 (July 1945): 101-08.



color maps, cluster plots, or adjacency
graphs. These techniques apply equally to 
any field in which complex data, patterns, 
and relationships must be understood. Many 
areas of scholarly communication may profit 
from this technology by visualizing quan- 
titative or qualitative data to gain insight 
into its meaning or to present complex re- 
sults in a perspicuous form. 

Though it is typically viewed as a very 
different trend, the technology of virtual 
reality is closely related to visualization. A 
virtual reality is a simulated world created 
in a computer, using traditional simulation 
or AI modeling techniques such as those 
discussed above. The user "enters" a vir- 
tual reality by wearing a display helmet or 
goggles to create the visual illusion of being 
in the simulated world (e.g., showing dif- 
ferent views as the user's head turns). The 
user interacts with the virtual reality by 
wearing devices such as instrumented gloves 
or suits that sense the user's hand or body 
position, thereby allowing the simulated 
world to react. The result is something like 
an intensified video game, in which the user 
perceives the virtual reality and interacts 
with it for some purpose. 

The power of virtual reality is that it har-
nesses the user's full sensory and motor
capabilities in exploring an abstract world, 
rather than relying on more limited facul- 
ties such as reading and typing. Coupled 
with modeling and visualization, this has 
the potential to allow a researcher to inter- 
act intimately with a virtual world created 
out of data or analytic results and to explore 
this world in a much more direct, exper- 
iential way than would be possible by read- 
ing numbers or even by viewing a graphical 
display. In addition to its potential for 
transforming certain aspects of the analytic 
process, virtual reality technology might also 
be of use during concept formation (allow- 
ing researchers to explore abstract spaces 
of concepts, represented as visual worlds) 
or for bringing the education of scholarly 
subjects to life (allowing students to ex-
perience subject matter as a virtual world).
Virtual reality may also be viewed as a log- 
ical extension of hypermedia, in which re- 
search results may be presented as a virtual 
world to be explored, rather than as a doc- 
ument to be seen or heard. 

Caveats. The trends described herein are 
not without their dangers. The legal issues 
surrounding electronic dissemination and 
connectivity have been pointed out above, 
as have some of the possible violations of 
privacy that result from working in an open, 
networked environment. Every technolog-
ical advance has its own risk for misuse,
whether this risk is legal, ethical, or merely 
a matter of lost productivity and quality. 
For example, the indiscriminate use of end- 
user publication and distribution may by- 
pass carefully established mechanisms for 
editorial and peer review, leading to a pro- 
liferation of low quality, unprofessional 
publications. Similarly, the use of hyper- 
media by authors who are not trained in 
graphic design or media presentation may 
produce a flood of incoherent research 
products whose complexity makes them in- 
accessible to their intended audiences. The 
naive use of modeling tools, visualization 
techniques, and virtual reality may seduce 
researchers into believing results that seem 
compelling despite the fact that they have 
not been validated. Researchers and audi- 
ences alike may tend to accept conclusions 
based on state-of-the-art computations, such 
as AI, with less than the required skepti- 
cism, especially if these computations ex- 
hibit a veneer of intelligence. 

These dangers are real and may well pla- 
gue scholarly researchers for decades to 
come, as they adopt new methods empow- 
ered by technology. Nevertheless, these 
trends appear inevitable and are likely to 
change the form and substance of scholarly 
communication in fundamental ways. 
Whether this change will ultimately im- 
prove the quality of that research is a ver- 
dict that only the future can deliver. 



Summary 

The availability of quantitative data and 
numerical techniques for analyzing them 
has had a marked effect on scholarly com-
munication over the past several decades.
The technology trends discussed here, as 
well as others that may prove to be impor- 
tant, are likely to have an even more pro- 
found impact. This impact will do more 
than simply change the work styles of 
scholarly researchers: It will affect their 
thought processes as well, suggesting new 
kinds of research questions and new kinds 
of answers. It will change the way re- 
searchers collaborate and interact with their 
peers and the way they produce their re- 
sults. It will change the form of these re- 
sults, the way they are distributed and 
disseminated, their audiences, and the im- 
pact they have on the research community 
and the public. These changes, already un- 
der way, will have profound implications 
for the information services, libraries, and 
archives that serve the research process. 



SCHOLARLY COMMUNICATION 
AND THE USE OF CURRENT 
INFORMATION TECHNOLOGY 

The previous section explored key trends 
in information technology most relevant to 
scholarly communication. This section 
considers the use of currently available in- 
formation technology by social science and 
humanities scholars to advance scholarship 
and intellectual productivity. The use of 
technology across the full spectrum of 
scholarly communication is considered by 
examining how researchers rely on tech- 
nology to: (1) identify sources, (2) com- 
municate with colleagues, (3) interpret and 
analyze data, (4) disseminate research find- 
ings, and (5) develop curriculums and aid 
instruction. Case examples of scholarly 
practices illustrate broader tendencies within
the field. 46 For analytical purposes, the im-
plications of these practices for archival
administration are discussed in the final 
section (Conclusion and Recommenda- 
tions) of this report. Although the discus- 
sion focuses primarily on practices in the 
social sciences and humanities, the emerg- 
ing patterns exhibited in these professions 
mirror those found in a broad range of dis- 
ciplines and occupations. 47 

The old assumptions commonly shared 
by archivists and librarians about the re- 
search process characterize a decreasing 
segment of the scholarly community. 48 In- 
stead, a paradigm shift is occurring in the 
research styles of social scientists and hu- 
manists, as in the scientific community, 
where: electronic communication is gain- 
ing prominence; direct online searching is 
replacing intermediary searching; research 
collaborations are becoming more com- 
mon; electronic sources available in homes 
and offices are becoming an alternative to 
reading room visits; source materials orig- 



46 The authors do not endorse particular techniques,
uses of technology, or the validity of results reported
in the case examples. Because empirical data on
scholarly use of information technology does not ex-
ist, this section relies on case examples intended to
be illustrative of broader trends within the social sci-
ences and humanities. Academic computing officers,
however, are beginning to recognize the need for such
data, and some have expressed interest in conducting
campuswide or intercampus surveys on scholarly use
of technology.

47 For a discussion of the impact of information
technology on the research process of intelligence an-
alysts see Michael R. Leavitt, The Analyst and Tech-
nology-2000, prepared for the U.S. Intelligence
Research and Development Council (January 1991).

48 Including such assumptions as: patrons discover
source materials essentially through word of mouth
and through supplemental assistance by intermedi-
aries; humanities and social science scholars conduct
research basically as individuals; primary sources, by
nature, require viewing in reading rooms fortified with
professional assistance; primary sources are best stored
and viewed in their original form or on microfilm; the
qualitative methods used in the analysis of sources
typically preclude computation; and the standard
scholarly products (e.g., publications) are linear doc-
uments distributed in print form.



inally created in print are being converted 
to machine-readable form; standard schol- 
arly research practices are extending to the 
use of artificial intelligence to interpret and 
analyze materials; and electronic publish- 
ing and nonlinear technology, such as hy- 
permedia, are prompting the development 
of new forms of scholarly research prod- 
ucts. The following explores how scholarly 
communication practices among social sci- 
entists and humanists are changing as a re- 
sult of the use of currently available 
information technology. 

Identification of Sources 

According to the professional literature, 
the key way scholars learn about relevant 
research materials is through their col- 
leagues. But in the last few years, word of 
mouth has been supplemented by new forms 
of electronic searching through online pub- 
lic access catalogs (OPACs). For instance, 
most campuses provide academics with di- 
rect access via personal computers to the 
institution's online library catalog. 49 In- 
stead of visiting the library, researchers can 
now explore descriptions of the library's 
holdings from their offices. Furthermore, 
if the institution's catalog proves insuffi- 
cient, scholars can access more than two 
hundred major American library catalogs, 
including those of the universities of Cali- 
fornia, Michigan, Pennsylvania, and Wis- 
consin, via the Internet. 50 For a 
comprehensive search of this nation's li- 
brary holdings, the Research Libraries In- 



49 See Clifford A. Lynch, "Library Automation and
the National Research Network," EDUCOM Review 
24 (Fall 1989): 22; and Communications in Support 
of Science and Engineering, 1-7. 

50 Conversation between Avra Michelson and Paul
Peters, executive director of the Coalition for Net-
worked Information, 7 May 1991. Clifford Lynch,
director of library automation for the University of
California, reports that as many as 30 percent of the
log-ons to the universitywide MELVYL library cata-
log are from remote sites.

formation Network (RLIN) is available over
the Internet, and plans are under way to
make OCLC available on the Internet as
well. 51 Humanist scholars report that, by
providing a comprehensive means to browse
through libraries in their homes and offices
when convenient, direct access to biblio-
graphic databases represents a source of in-
tellectual empowerment. 52 The use of online
catalogs probably represents the most 
widespread example of scholarly practices 
in the social sciences and humanities that 
involve end-user computing and connectiv- 
ity. 

Communication with Colleagues 

The search for sources and the need to 
refine intellectual ideas motivate academics 
to communicate with their colleagues. In- 
deed, communication of this sort is fun- 
damental to the advancement of scholarship. 
Beyond the most common methods of com- 
munication (such as face-to-face discus- 
sions, telephone conversations, written 
correspondence, or public presentations) 
scholars are using e-mail and a variety of 
new electronic communication formats de- 
rived from it for academic interchange. 
Scholars naturally still talk to one another, 
but many information exchanges occur 
through network communications rather than 
through oral discourse. 53 E-mail exchanges 
are growing at an astonishing rate, and cur-



51 See Clifford A. Lynch, "The Growth of Com-
puter Networks: A Status Report," Bulletin of the
American Society for Information Science 16 (June/
July 1990): 10; and Robert Weber, "Libraries With-
out Walls?" Publishers Weekly 237 (8 June 1990):
S20-S22.

52 Stephen Lehmann and Patricia Renfro, "Human-
ists and Electronic Information Services: Acceptance
and Resistance," College and Research Libraries 52
(September 1991): 411.

53 Many argue that although less interactive, for brief
exchanges e-mail is a far deeper medium for com-
munication than oral discourse. Unfortunately, there
are currently no studies on the nature and extent of
the use of e-mail among scholars, but its significance
as a new communication medium is indisputable.



rently constitute approximately half the
traffic on research and education net- 
works. 54 The global spread of e-mail has 
been rapid, and it is now possible for 
American scholars to communicate via e- 
mail with colleagues in close to 140 other 
countries. The popularity of e-mail among 
scholars emphasizes the increasing impor- 
tance of network connectivity in the daily 
life of academics. 55 

As an outcome of e-mail, scholars are 
creating new formats for substantive ex- 
change to supplement conventional com- 
munication. For example, nearly thirty 
thousand public-access electronic bulletin 
boards are currently available through re- 
search and education networks. This is up 
from fourteen thousand such applications 
counted one year earlier. 56 BITNET, a net- 
work developed during the mid-1980s to 
provide rapid communication among re- 
searchers, educational institutions, and 
funding agencies, reports more than two 
thousand listservs. Listservs are discussion
groups that allow people with common in- 
terests to communicate with one another by 
sending to a special network address mail 
that is automatically distributed to each 
person who has subscribed to a particular 
list. 57 
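
The distribution mechanism just described can be sketched in a few lines of Python. The sketch is purely illustrative and is not the LISTSERV software itself: the class name, the method names, and every address below are hypothetical, and delivery is simulated by returning (recipient, message) pairs rather than by sending real mail.

```python
class DiscussionList:
    """Toy model of a listserv-style list: one address, many subscribers."""

    def __init__(self, address):
        self.address = address      # the special network address of the list
        self.subscribers = set()    # addresses of everyone who has subscribed

    def subscribe(self, person):
        self.subscribers.add(person)

    def unsubscribe(self, person):
        self.subscribers.discard(person)

    def post(self, sender, body):
        """Return one (recipient, message) pair per subscriber.

        A real list server would hand each copy to a mail system;
        here delivery is only simulated.
        """
        message = f"From: {sender}\nTo: {self.address}\n\n{body}"
        return [(person, message) for person in sorted(self.subscribers)]


# Hypothetical usage: one posting is copied to every subscriber.
folklore = DiscussionList("folklore@example.edu")
folklore.subscribe("reader1@example.edu")
folklore.subscribe("reader2@example.edu")
deliveries = folklore.post("reader1@example.edu",
                           "Seeking sources on ballad variants.")
```

The point of the design is that a sender needs to know only the list address; the list of recipients is maintained by the server and can change without the sender's knowledge.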



54 Presentation by Paul Peters to a joint meeting of
the National Association of Government Archives and
Records Administrators Committee on Information
Technology, and the SAA Committee on Automated
Records and Techniques, Washington, D.C., 22 April
1992.

55 The EDUCOM/USC Survey of Desktop Comput-
ing in Higher Education estimates that 25 percent (ex-
trapolated figure) of faculty at four year public and
private universities and colleges use e-mail. See Green
and Eastman, Campus Computing 1990, 23. See the
two studies by Tora K. Bikson and Sally Ann Law
cited in the previous section for studies on the use of
e-mail within an office environment.

56 Christopher Lindquist, "Ferret Lovers Unite and
Download," Computerworld 25 (12 August 1991): 1.

57 Theodore J. Hull, SSXA Reference Report, Cen-
ter for Electronic Records, National Archives and
Records Administration (June 1991 draft), 2; and Eric
Thomas, Revised List Processor (Listserv@frecp11),
Release 1.5d, Ecole Centrale de Paris, from Lis-

Of the thousands of electronic discussion
groups, or conferences, operating on the 
Internet, close to 600 are devoted to schol- 
arly topics in the social sciences and hu- 
manities. 58 The rate of growth of these 
scholarly electronic conferences is aston- 
ishing. From 1990 to 1991, 200 new con- 
ferences were identified on the Internet. For 
the eight months prior to March 1992, an 
additional 150 conferences in the social sci- 
ences and humanities were added to the ex- 
isting directory of listings. 59 Scholars have 
established conferences in virtually every 
field within every discipline. For example, 
there are conferences on topics such as Hel- 
lenic culture, folklore, modern British and 
Irish literature, the Vietnam War, and eigh- 
teenth-century world history. There are 
conferences devoted to the study of coun-
tries or regions, such as Peru, Iberia, Latin
America, and the Baltic states. There are 
conferences on the works of single authors, 
such as James Joyce, John Milton, Thomas 
Pynchon, and Hegel, and there are confer- 
ences devoted to concepts such as libertar- 
ianism, intuition in decision making, ethics, 
and fraud in science. 60

The Humanist, an electronic conference 
established several years ago, serves as a
focal point for discussions of humanities 
computing techniques and research meth- 



tserv@indycms.iupui.edu (13 June 1991 17:55:18),
1.

58 Diane Kovacs, Directory of Scholarly Electronic
Conferences, 3rd ed. (Kent State, Ohio: Kent State
University Libraries, August 1991) [available on Bit-
net/Internet at Listserv@kentvm or FTP from
ksuvxa.kent.edu]. The directory, an indispensable,
growing resource, is also available in print as Direc-
tory of Electronic Journals, Newsletters and Aca-
demic Discussion Lists by Michael Strangelove and
Diane Kovacs, edited by Ann Okerson, 2nd ed.
(Washington, D.C.: Association of Research Librar-
ies, March 1992). The conference figures cited reflect
updated information that Diane Kovacs was kind
enough to share with Avra Michelson.

59 These figures may underrepresent actual scholarly
activity, as Kovacs warns that the directory's cover-
age of Usenet is less than comprehensive.

60 For a description of these conferences, see the
Kovacs directory cited earlier.



ods. It also broadcasts announcements and 
includes a column for ongoing queries and 
responses that cover a broad range of issues 
of interest to humanist scholars. The Hu- 
manist is transmitted daily to about two 
thousand readers, including subscribers in 
Europe and the Near East. 61 A British 
counterpart, Humanities Online Bulletin, 
operates as a forum for humanists to ex- 
change experience, solicit advice and in- 
formation, notify one another of projects, 
review publications, and make announce- 
ments. The almost thirteen hundred regis-
tered readers are mostly members of
humanities departments in British univer- 
sities. 62 These electronic discussion groups 
serve a unique role in scholarly communi- 
cation in that they permit the rapid inter- 
change of current information, ideas, and 
perspectives. No other medium has per- 
mitted scholars to communicate with an in- 
ternational group of peers quickly and 
effortlessly at the front end of the research 
process. (The scholarly implications of the 
new exchange mediums are examined fur- 
ther in the section Dissemination of Re- 
search Findings later in this report.) 

Interpretation and Analysis of Sources 

The use of information technology to as- 
sist in interpreting and analyzing data rep- 
resents one of the most important paradigm 
shifts toward end-user computing in schol- 
arly research practices. Scholars are both 
converting primary textual sources to ma- 
chine-readable form to allow for conven- 
tional computational processing and using 
artificial intelligence to do new types of 
machine-assisted interpretation and analy- 



61 Elaine Brennan and Allen Renear are the current
co-editors of the Humanist. Information on the Hu-
manist from a telephone conversation between Avra
Michelson and Allen Renear, 17 December 1990, and
from the description of the conference that appears in
the Kovacs directory.

62 Brendan Loughridge, "Information Technology,
the Humanities and the Library," Journal of Infor-
mation Science 15 (July-September 1989): 280.

sis. The use of computing to perform inter-
pretation and analysis is a developmental 
trend with broad implications for scholar- 
ship. These practices suggest fundamental 
changes in scholarly methods, and each will 
be examined in depth. 

Computer-assisted analysis achieved
through conversion. Social scientists and
humanities scholars use both quantitative 
and qualitative methods to analyze and in- 
terpret sources. Typically, the search for 
and evaluation of evidence involves both 
types of methods. At one end of the con- 
tinuum, quantitative analysis involves the 
use of mathematical processes such as a 
count of frequencies and distributions of 
occurrences, or higher level statistical tech- 
niques. At the other end of the continuum, 
qualitative analysis typically involves non- 
mathematical processes oriented toward 
language, interpretation, or the building of 
theory. 63 

Scholarly analysis often involves the 
processing of large and sometimes massive 
amounts of textual sources. 64 But research- 
ers have discovered that many of the meth- 
ods of interpretation and analysis related to 
both quantitative and qualitative methods 
are processes that can be performed by 
computers. For example, computers can 
count (e.g., they can count words, births, 
deaths, marriages, commercial activity, and 
even brush strokes used in a Rembrandt 
painting). Computers can perform regres- 
sion analysis to suggest cause and effect 
relationships. Through the use of advanced 
technology, computers can perform pattern 
recognition, do semantic analysis, analyze 
text, and model concepts. And computers 
can perform these processes faster, over 
more sources, and with greater precision 



63 Nigel G. Fielding and Raymond M. Lee, eds.,
Using Computers in Qualitative Research (London:
Sage Publications, 1991), 4.

64 The use of nontextual sources of evidence, such
as photographs, film footage, artifacts, and sound re-
cordings is significant as well.



than scholars who must rely on manual 
interpretation of data. 
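
As a minimal illustration of the counting described above, the following Python sketch tallies word frequencies in a machine-readable text. The function name and the sample sentence are invented for illustration. A real project would also have to settle questions of spelling variants, lemmatization, and character encoding, none of which is addressed here.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    # A word is a run of letters, optionally joined by internal
    # apostrophes or hyphens (for example "eighteenth-century").
    words = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())
    return Counter(words)


# Hypothetical sample text, not drawn from any of the projects cited.
sample = "The archive holds the records; the Records endure."
counts = word_frequencies(sample)
most_common_word, occurrences = counts.most_common(1)[0]
```

The same tally, run over a full corpus rather than one sentence, is the starting point for the frequency and distribution analyses mentioned earlier.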

But if computers are to be used for these 
purposes, source materials must be in ma- 
chine-readable form. For this reason, many 
scholars, once they have identified the key 
sources for their research, are converting 
them to machine-readable form so that they 
are in a form amenable to computer-as- 
sisted analysis. 65 

Scholarly conversion of sources to ma- 
chine-readable form has been occurring for 
at least forty years. At first the practice was 
generally limited to numeric data. But in 
more recent years, the scholarly appetite 
for machine-readable data has extended to 
text as well. Textual conversion projects 
undertaken by individual scholars or under 
the auspices of academic institutions are far 
more prevalent than one might expect, es- 
pecially in the fields of linguistics, classics, 
religion, and even history. The Center for 
Electronic Text in the Humanities estimates 
that there are currently eight thousand se- 
ries of converted electronic text. 66 The con- 
version efforts among scholars are an 
example of the manifestation of end-user 
computing, in an effort to store, retrieve, 
manipulate, and analyze large amounts of 
sources in electronic form. The availability 



65 See Avra Michelson, "Forecasting the Use of
NREN by Humanities Scholars," paper presented at
the panel "New Constituencies for the NREN," 27
March 1992, National NET '92, Washington, D.C.
Available electronically on the Coalition for Net-
worked Information fileserver.
Contact craig@cni.org for transfer information.

66 Conversations between Avra Michelson and Mar-
ianne Gaunt, Center for Electronic Texts in the Hu-
manities, Rutgers University, on 30 October 1990,
and 14 May 1991. Rutgers and Princeton universities
recently announced the creation of the jointly spon-
sored Center for Electronic Texts in the Humanities
to respond to the information needs of a new gener-
ation of scholars. The center will develop an inter-
national inventory of machine-readable textual source
materials, provide catalog entries through the Re-
search Libraries Information Network (RLIN), and ul-
timately make electronic textual source materials
available to researchers on research and education net-
works.

of an electronic corpus of sources on a topic
encourages new types of questions to be 
asked and hypotheses to be explored. 

The conversion of paper-based textual
source materials to machine-readable form 
occurs worldwide. The earliest American 
conversion project, the Thesaurus Linguae 
Graecae (TLG), was founded in 1972 by 
Theodore F. Brunner at the University of 
California at Irvine to create an electronic 
data bank of extant ancient Greek texts from 
the period of Homer (ca. 750 B.C.) through 
about A.D. 600. The massive electronic file 
is used by researchers in Greek language 
and literature, linguistics, ancient history, 
philosophy, and religious studies to access 
Greek texts and related documents in full 
text. In conjunction with the American 
Philological Association, many members 
of the classicist profession participate in the 
ongoing compilation. Today the TLG is an 
immense, growing database of more than 
eight thousand works of classical Greek lit- 
erature stored on CD-ROM, copies of which 
are available at two hundred locations in 
this country and abroad. 67 

Another conversion effort, the American 
and French Research on the Treasury of the 
French Language (ARTFL) draws on the 
work of the French government since 1957 
to create a new dictionary of the French 
language. In conjunction with the devel-
opment of the dictionary, the French de-
veloped an electronic database of 
approximately 150 million words derived 
from major literary and philosophical works 
and scientific and technical texts. For in- 
stance, the auxiliary database contains the 
novels of prominent and popular authors, 
correspondence, literary criticism, an ex- 
tensive collection of poetry and theater, 



travelogues, biographies, historical works, 
political documents, biblical commentary, 
philosophical and economic essays, and 
writings on biology. 

In 1979, the National Endowment for the 
Humanities (NEH) granted funds to the 
University of Chicago to conduct a survey
of North American French literary scholars 
and historians whose work focused on the 
eighteenth to twentieth centuries. The pur- 
pose of the survey was to evaluate the po- 
tential usefulness of the ARTFL database 
to their work. Based on the scholars' en- 
dorsement, France deposited the corpus of 
fifteen hundred machine-readable texts at 
the University of Chicago in 1982. After 
the database was restructured to allow for 
text analysis, the electronic materials were 
made available to researchers. As an on- 
going project at the University of Chicago, 
scholars continue to augment the database 
with, for example, a collection of trouba- 
dour poetry estimated to include 65 percent 
of the genre's extant poems; a collection 
of texts from the 1848 revolution, includ- 
ing radical newspaper articles, pamphlets, 
posters, speeches, and manifestos by pro- 
letarian leaders; and a collection of sev- 
enteenth-century French theater pieces. 

A variety of scholars use the ARTFL, 
including Keith Baker, a University of Chi- 
cago historian of ideas. Baker's research 
concerns the attempt to redefine traditional 
terms during the Enlightenment to conform 
with the new political and social order. Ac- 
cording to Baker, the "advantage of the 
ARTFL Project is that it provides a broad 
basis for systematic analysis of . . . key 
terms," 68 such as the occurrence of im- 
portant political phrases like "opinion pub- 
lique" in eighteenth-century texts. 69 Other 



67 For information on TLG, see Theodore F. Brun-
ner, "Data Banks for the Humanities: Learning from
Thesaurus Linguae Graecae," Scholarly Communi-
cation 7 (Winter 1987): 1, 6-9; and David S. Miall's
"Introduction," in Humanities and the Computer: New
Directions (Oxford: Clarendon Press, 1990), 5.



68 Alice Musick McLean, Robert Morrissey, and
Donald A. Ziff, "ARTFL: A New Tool for French
Studies," Scholarly Communication 8 (Spring 1987):
8.

69 For another example of research devoted to the
historical analysis of language, see Mark Olsen and

instructors have used ARTFL to teach
French nationalism or to study the literary 
myth of Charlemagne as recorded from the 
middle ages to the nineteenth century. 
ARTFL is available online to scholars and 
students at institutions that participate in an 
inter-university fee-based consortium. 70 
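
The kind of term search described above, such as locating each occurrence of a political phrase across a body of texts, can be approximated by a simple keyword-in-context listing. The Python sketch below is an assumption-laden illustration and not ARTFL's actual retrieval software: the function name, the `width` parameter, and the sample passages with their labels are all invented.

```python
import re


def phrase_in_context(texts, phrase, width=30):
    """Yield (label, snippet) for each case-insensitive occurrence of phrase.

    texts maps a label (for example a title or year) to the full text.
    Spaces in the phrase match any run of whitespace, including line breaks.
    """
    pattern = re.compile(r"\s+".join(map(re.escape, phrase.split())),
                         re.IGNORECASE)
    for label, text in texts.items():
        for match in pattern.finditer(text):
            start = max(match.start() - width, 0)
            end = min(match.end() + width, len(text))
            # Collapse line breaks so each snippet prints on one line.
            yield label, " ".join(text[start:end].split())


# Invented sample passages; labels and wording are hypothetical.
corpus = {
    "text A": "Le tribunal de l'opinion\npublique juge les rois.",
    "text B": "Rien de tel dans ce passage.",
    "text C": "Opinion publique et encore opinion publique.",
}
hits = list(phrase_in_context(corpus, "opinion publique", width=10))
```

Listing each hit with a little surrounding text lets a researcher judge how a term is being used, which a bare count cannot show.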

A third large file, the Medieval and Early
Modern Data Bank (MEMDB), was founded
in 1982 at Rutgers University by Rudolph 
M. Bell and Martha C. Howell to establish 
an electronic library for medieval and early 
modern historians. The data bank consists 
of text descriptions of currency exchange 
rates, including a master data set of tabular 
works concerning medieval and early mod- 
ern history. More than thirteen thousand 
medieval currency exchange quotations from 
the mid-twelfth century to 1500 A.D. are 
available, covering Europe, Byzantium, the 
Levant, and North Africa. MEMDB is a 
growing database; plans for expansion in- 
clude adding taxation records, wills and in- 
ventories, parish records, vital statistics, 
company records, import/export records,
household/estate accounts, paleopathol- 
ogy studies, and such reference aids as 
glossaries of weights and measures, gaz- 
etteers of Latin and vernacular place names, 
and calendars of dates. The Research Li- 
braries Group (RLG) is preparing a CD- 
ROM version of MEMDB for release. 71 



Louis-Georges Harvey, "Computers in Intellectual
History: Lexical Statistics and the Analysis of Polit-
ical Discourse," Journal of Interdisciplinary History
18 (Winter 1988): 449-64.

70 McLean et al., "ARTFL," 1, 6-9.

71 Information reported in a phone conversation by
Marianne Gaunt, the Center for Electronic Texts in
the Humanities, Rutgers University, to Avra Michel-
son 14 August 1991. See also Loughridge, "Infor-
mation Technology," 281; and Rudolph M. Bell
(Rutgers University), "User Perspectives and Re-
quirements: Creator of Non-bibliographic Databases
Has to Share with Others," unpublished paper pre-
sented to the Library of Congress Network Advisory
Committee Meeting, 29-31 March 1989, Washing-
ton, D.C.; also information reported by Rudolph Bell
to Avra Michelson in a phone conversation 13 No-
vember 1991.



The TLG, ARTFL, and MEMDB rep- 
resent discipline-specific electronic com- 
pilations, but many smaller and often more 
diverse humanities conversion projects also 
exist. 72 For instance, under the direction of 
Robert Hollander at Princeton University, 
the Dante Project converted to electronic 
form the complete text of sixty commen- 
taries on Dante in Italian, Latin, and Eng- 
lish. Before the Dante conversion project, 
many of these works were unavailable in 
the United States. 73 The purpose of Vic- 
toria Kirkham's Penn Boccaccio Project at 
the University of Pennsylvania is to de- 
velop an electronic archives that establishes 
links between the author's writings and the 
seven thousand illustrations of his work that 
were created contemporary to his lifetime 
in the fourteenth century through the six- 
teenth century. 74 

Several archival conversion projects are 
under way in England. For example, the 
Brotherton Library is compiling a complete 
database of its seventeenth and eighteenth 
century manuscript verse. The University 
of York History Department initiated a joint 
effort with the York Archaeological Trust 
both to develop a computerized database of 
the town's title deeds and to create a re- 
construction of the region's topographical 
evolution between the twelfth and sixteenth 
centuries. At the University of Southamp- 
ton, scholars are developing an online da- 
tabase of the papers of the first Duke of 
Wellington. 75 

At Bar-Ilan University in Israel, Yaacov 
Choueka is constructing a Jewish culture 



72 Georgetown University's Center for Text and 
Technology has compiled a database of descriptions 
of more than three hundred conversion projects, many 
of which comprise hundreds of series. 

73 Constance Gould, Information Needs in the Hu- 
manities: An Assessment, Prepared for the Program 
for Research Information Management of the Re- 
search Libraries Group, Inc., Stanford, Calif.: 1988, 
27. 

74 Ibid., 27. 

75 Loughridge, "Information Technology," 281. 

database, the Global Jewish Database/Res- 
ponsa Project. This database includes about 
fifty thousand rabbinical answers to ques- 
tions about Jewish life and culture, the Ba- 
bylonian Talmud, Midrash literature, 
medieval commentaries, Maimonides Code, 
and the full text of the Hebrew Bible. When 
completed, the database will contain full 
text of nearly all written Hebrew works up 
to the tenth century as well as about one 
thousand major sources on Jewish cul- 
ture. 76 At the University of Pennsylvania, 
Robert Kraft and John Abercrombie, work- 
ing in conjunction with the Packard Hu- 
manities Institute, issued a CD-ROM 
containing at least ten versions of the Bible 
as well as a dictionary of New Testament 
Greek, classical Latin texts, Greek inscrip- 
tions, and various texts including Sanskrit 
sources. 77 

England's Oxford Text Archive (OTA) 
is a large repository of machine-readable 
text and includes text bases in more than 
twenty-five languages. Recently it served 
as a key source for a dissertation on Jane 
Austen's novels. 78 In Pisa, Italy, the Isti- 
tuto di Linguistica Computazionale, one of 
the oldest and largest repositories of ma- 
chine-readable classical and modern texts, 
has converted an extensive variety of ma- 
terials, including Italian newspapers and 
periodicals, modern novels and poetry, and 
works of nonfiction. 79 Similarly, the Brit- 
ish Domesday project assembles a variety 
of textual and visual information on con- 
temporary Great Britain. 80 



76 Gould, Information Needs in the Humanities, 39, 
and Loughridge, "Information Technology," 280. 

77 Gould, Information Needs in the Humanities, 38. 

78 The Center for Electronic Texts is cataloging the 
records of the OTA. The catalog is made possible 
through an NEH grant, and the center is describing 
the approximately eight hundred records that comprise 
the OTA and making the descriptions available through 
RLIN. For information on the OTA, see Miall, ed., 
"Introduction," in Humanities and the Computer, 5, 
and Gould, Information Needs in the Humanities, 27. 

79 Gould, Information Needs in the Humanities, 27. 

80 Miall, "Introduction," in Humanities and the 
Computer: New Directions, 5. 



Other conversion efforts involve do- 
mains such as Italian Renaissance music 
and lyric poetry, Spanish texts, medieval 
medicine-related drawings and illustra- 
tions, the papers of Charles Sanders Peirce, 81 
and the works of literary greats such as 
Shakespeare, Shelley, Faulkner, and Mil- 
ton. 82 Besides these institution-based con- 
version efforts, hundreds of smaller projects 
are addressing the needs of particular teams 
of researchers. Humanities scholars predict 
that the millions of words of text already 
available in machine-readable form rep- 
resent only a minute fraction of source ma- 
terials to be converted in the next ten to 
fifteen years. 83 Scholars contend that the 
reuse of textual databases by those other 
than the original converters will soon, if 
it does not already, constitute the predom- 
inant use. 

In an effort to compile a massive elec- 
tronic text corpus that will serve as a com- 
prehensive research resource, language 
scholars have initiated the Data Collection 
Initiative (DCI). Sponsored by the Asso- 
ciation for Computational Linguistics, the 



81 This new effort involves a consortium that in- 
cludes the current documentary editing project on Peirce 
centered at Indiana University-Purdue University, 
working with the philosophy departments at Harvard 
and Texas Tech universities, Georgetown Universi- 
ty's Center for Text and Technology, Brown Univer- 
sity's Computing and Information Services, and George 
Washington University's Department of Communi- 
cation. The consortium plans to convert Peirce's large 
print manuscript collection housed at the Houghton 
Library, along with secondary commentaries on Peirce's 
work, to machine-readable form. The database would 
also include provisions for electronic scholarly com- 
munication on the vastly interdisciplinary work of 
Peirce. 

82 References appear on the database of electronic 
texts compiled by Georgetown University's Center for 
Text and Technology. 

83 Association for Computers and the Humanities, 
the Association for Computational Linguistics, and 
the Association for Literary and Linguistic Comput- 
ing, "Proposal for Funding for an Initiative to For- 
mulate Guidelines for the Encoding and Interchange 
of Machine-Readable Text," unpublished proposal 
prepared for the National Endowment for the Human- 
ities, 1988, 12. 

DCI is the most extensive international col- 
laboration of its kind. The ultimate goal of 
the project is to develop a global electronic 
library of text available for online research, 
primarily to serve the needs of computa- 
tional linguists. Coordinated by Mark Lib- 
erman at the University of Pennsylvania's 
Department of Linguistics, the DCI in- 
cludes a broad sample of materials, such 
as the archives of the Challenger investi- 
gation commission, which constitutes about 
2.5 million words of deposition and hear- 
ings transcripts; portions of the Library of 
America volumes; 200,000 U.S. Depart- 
ment of Energy scientific abstracts; U.S. 
Department of Agriculture Extension Ser- 
vice fact sheets; the Federalist Papers; the 
King James Bible; computing journals; and 
sample correspondence and dictionaries. 84 
Besides acquiring a large corpus of elec- 
tronic text, scholars are developing encod- 
ing standards for documents, to ensure that 
converted files can be read on a variety of 
computers and software. The Text Encod- 
ing Initiative (TEI) is a collaboration among 
the Association for Computers and the Hu- 
manities, the Association for Computa- 
tional Linguistics, and the Association for 
Literary and Linguistic Computing, which 
received funding from the National Endow- 
ment for the Humanities (NEH), the Eu- 
ropean Economic Community, and the 
Mellon Foundation to determine the ele- 
ments and the methods for encoding ma- 
chine-readable text for electronic 
exchange. 85 The first phase of funding is 



84 Information on the DCI from phone conversations 
between Avra Michelson and Don Walker, Bellcore 
(14 May 1990), and Mark Liberman, AT&T Bell Lab- 
oratories (5 June 1990); see also Mark Liberman, 
"Report to the ACL Executive Committee on the ACL/ 
DCI," (5 June 1990). 

85 Some scholars consider questions of what to en- 
code as serious a concern as how to encode. For in- 
stance, should encoding indicate the physical condition 
of a document by marking the presence of ink spots, 
water stains, brittleness of paper, etc.? 



devoted to the needs of literary, linguistic, 
and text-oriented historical research. 86 

The TEI encoding standards closely fol- 
low the International Standards Organiza- 
tion's standard ISO 8879, the Standard 
Generalized Markup Language (SGML). 
This interchange format specifies how to 
encode (or mark up) texts so that they can 
be shared in a machine- and software-in- 
dependent form by different research proj- 
ects for different purposes. The TEI 
encoding standards use delimiters and tags 
to distinguish markup from text and to ex- 
press specific information about the format 
of a document. 87 A draft version of the TEI 
standards is circulating to scholars and in- 
dustry for review. 88 
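
The idea of using delimiters and tags to distinguish markup from text can be illustrated with a minimal sketch in Python. It is not part of the TEI guidelines: the tag names (`poem`, `title`, `stanza`, `l`) and the `parse` function are hypothetical, chosen only to show how software might recover the structure of an encoded document, and real SGML permits many constructs (attributes, entities, omitted end tags) that this sketch ignores.

```python
import re

# Hypothetical SGML-style fragment. The tag names are illustrative only
# and are not taken from the TEI guidelines.
SAMPLE = (
    "<poem><title>Song</title>"
    "<stanza><l>First line</l><l>Second line</l></stanza></poem>"
)

# The delimiters "<" and ">" enclose a tag; everything else is text.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def parse(source):
    """Return (tag_path, text) pairs, one for each run of text."""
    stack, out = [], []
    for closing, name, text in TOKEN.findall(source):
        if text:
            # Text is reported together with the elements that enclose it.
            out.append(("/".join(stack), text))
        elif closing:
            # An end tag must match the most recently opened element.
            if not stack or stack[-1] != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()
        else:
            stack.append(name)
    if stack:
        raise ValueError(f"unclosed <{stack[-1]}>")
    return out


pairs = parse(SAMPLE)
for path, text in pairs:
    print(path, "->", text)
```

Because the structure is carried by the markup rather than by any one program's file format, a second project could read the same file and, for example, extract only the lines of verse.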

The extraordinary projects under way by 
scholars to convert source materials to ma- 
chine-readable form, assemble an elec- 
tronic corpus of textual data, and establish 
data format standards for the interchange 
of text are in essence efforts aimed at fa- 
cilitating end-user computer-assisted analy- 
sis of sources within the social sciences and 
humanities. 

Computer-assisted analysis with arti- 
ficial intelligence. Some scholars are con- 



86 Association for Computers and the Humanities et 
al., Proposal for Funding, 59. 

87 SGML is a standard set of instructions for com- 
posing machine-readable tag sets and grammars. SGML 
applications, such as the TEI guidelines, establish tags 
and delimiters for the interchange of all types of text, 
including rules for encoding many types of document 
structures and data elements. The encoding allows 
computers, using appropriate software, to "read" the 
structure of a document (e.g., to know that an an- 
thology of poems contains individual poems and that 
each possesses a title, stanzas, and lines), and to pres- 
ent it as such to the user; for further explanation of 
SGML, see C. M. Sperberg-McQueen and Lou Bur- 
nard, eds., Guidelines for the Encoding and Inter- 
change of Machine-Readable Texts (Chicago, Oxford: 
Text Encoding Initiative Version 1.1, October 1990). 

88 Scholars in Europe have formed the History 
Working Party, a subgroup of the Text Encoding In- 
itiative, to ensure that TEI encoding guidelines address 
the needs of historians (e-mail via Internet from Don- 
ald A. Spaeth to Avra Michelson, 9 August 1991). 

verting records to machine-readable form 
so that artificial intelligence (AI) can be 
used to assist in data interpretation and 
analysis. The use of AI in scholarly re- 
search signals a new phase in social science 
and humanities end-user computing. 89 As 
early as 1986, a panel of specialists brought 
together by the National Science Founda- 
tion reported that AI methods held great 
promise for research in the social sciences, 
especially in relation to the analysis and 
interpretation of complex situations, re- 
search design, and theory formation. 90 
Within the humanities, scholars contend that 
the ability to process incomplete and in- 
consistent data with software that supports 
uncertainty and changes in beliefs makes 
AI uniquely suitable for many research ef- 
forts. 91 

In the area of AI, political scientists cur- 
rently are the most sophisticated experi- 
menters outside the hard sciences. Their 
prominence with AI calls to mind their ear- 
lier role as the pioneer users of computa- 
tional processing with electronic numeric 
data. They are using artificial intelligence, 
especially in the area of international rela- 
tions, to model decision making for the study 
of "deterrence, escalation control and war 
termination. " 92 The applications involve the 
choices defense programs and military op- 
erations confront during peace, as well as 
methods for evaluating choices during a 



89 See, for instance, Miall, Humanities and the 
Computer, 2; or for an earlier discussion, E. Casetti 
et al., "Regarding the Feasibility and Desirability of 
Conferences on 'The Methodological Research Fron- 
tiers and the Social Sciences,'" Final Report to the 
National Science Foundation (NSF Award No.: OIR 
8406230), 10 September 1986, 13. 

90 Casetti et al., "Regarding the Feasibility and De- 
sirability of Conferences," 13. 

91 Miall, Humanities and the Computer, 6. 

92 See Paul K. Davis, "A New Analytic Technique 
for the Study of Deterrence, Escalation Control, and 
War Termination," in Artificial Intelligence and Na- 
tional Security, edited by Stephen J. Cimbala (Lex- 
ington, Mass.: Lexington Books, 1987), 35-60. 



conflict. They explicitly address compli- 
cations that decision makers face, such as 
conflicting principles and objectives, ill- 
defined alternatives, the complexity of 
problems, and the pervasive uncertainty of 
assumptions. As a result of intensive work, 
some existing prototypes are evolving into 
more advanced applications. 93 

Besides the "conflict-oriented" proj- 
ects, other examples include AI prototypes 
that interpret Sino-Soviet negotiating ses- 
sions, 94 recognize patterns over a large, 
complex data set of historical events for 
purposes of prediction, 95 and generate hy- 
potheses by exploring data to induce rules. 
One application of this last type analyzes 
the factors that influence different satisfac- 
tion levels of state legislators with legisla- 
tive outcomes. The developers contend that 
the existing application can be adapted for 
use with similar research questions. 96 

The discipline's innovators argue that AI 
techniques should be considered "standard 
components in every political scientist's tool 
kit." 97 In making their case, they argue 
that many foreign policy questions rep- 
resent suitable AI applications, such as the 
degree to which the Soviet economy de- 
clined under Brezhnev or the impact of the 
development of a navy on China's foreign 
policy. 98 One of the discipline's journals, 



93 Ibid., 35-55. 

94 See William deB. Mills, "Rule-Based Analysis 
of Sino-Soviet Negotiations," Social Science Com- 
puter Review 8 (Summer 1990): 181-95. 

95 Philip A. Schrodt, "Pattern-Matching, Set Pre- 
diction, and Foreign Policy Analysis," in Artificial 
Intelligence and National Security, 89-107. 

96 G. David Garson, "The Role of Inductive Expert 
Systems Generators in the Social Science Research 
Process," Social Science Microcomputer Review 5 
(Spring 1987): 11-18. 

97 William deB. Mills, "Rule-Based Analysis," 182; 
and Paul A. Anderson, "Using Artificial Intelligence 
to Understand Decision Making in Foreign Affairs: 
The Problem of Finding An Appropriate Technol- 
ogy," in Artificial Intelligence and National Security, 
133. 

98 See also, for instance, the ten or so articles in 
Artificial Intelligence and National Security. 






Social Science Computer Review, keeps 
readers current on AI software with regular 
reviews of expert systems shells. 

Unlike political scientists, most histori- 
ans using AI tend to apply it to a narrower 
range of research questions. The chief use 
of AI in historical research is in applica- 
tions designed to build nominal record link- 
ages to reconstruct the population history 
of past societies. 99 This technique is usu- 
ally used with family and community re- 
construction, an area of study already quite 
computer-oriented. Nominal record linkage 
involves the analysis of parish and census- 
like records to reconstruct individual ident- 
ities and relationships among individuals. 
It is a complex process that requires much 
interpretation because of the prevalence of 
homonic names (multiple names, with the 
same sound and often the same spelling, 
which refer to different people), name var- 
iations, and the need to link evidence re- 
lated to the same individual from separate 
records. Historians typically consider an 
individual's vital dates, residence, profes- 
sion, filiation, and other available data to 
decide whether several pieces of evidence 
refer to the same person. 
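
The kind of judgment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any project discussed here: the field names, the weights, the two-year tolerance on birth dates, the threshold, and the sample records are all hypothetical, and a simple string-similarity measure stands in for the far richer rules an expert system would apply.

```python
from difflib import SequenceMatcher

# Illustrative weights and threshold. These are assumptions made for the
# sketch, not values drawn from any historical project.
WEIGHTS = {"name": 0.5, "born": 0.3, "residence": 0.2}
THRESHOLD = 0.8
YEAR_TOLERANCE = 2


def name_similarity(a, b):
    """Similarity between 0 and 1; tolerates variants such as Jean/Jehan."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_score(r1, r2):
    """Combine the evidence that two records describe one individual."""
    score = WEIGHTS["name"] * name_similarity(r1["name"], r2["name"])
    if abs(r1["born"] - r2["born"]) <= YEAR_TOLERANCE:
        score += WEIGHTS["born"]
    if r1["residence"].lower() == r2["residence"].lower():
        score += WEIGHTS["residence"]
    return score


def same_person(r1, r2):
    return link_score(r1, r2) >= THRESHOLD


# Hypothetical records: a spelling variant and a namesake.
baptism = {"name": "Jean Martin", "born": 1650, "residence": "Saint-Pierre"}
census = {"name": "Jehan Martin", "born": 1651, "residence": "Saint-Pierre"}
namesake = {"name": "Jean Martin", "born": 1690, "residence": "Saint-Paul"}

print(same_person(baptism, census))    # variant spelling, consistent data
print(same_person(baptism, namesake))  # identical name, conflicting data
```

Under these assumed weights the variant spelling is linked and the namesake is not, because an identical name alone contributes only half of the possible score. Choosing the weights and the threshold is itself an act of interpretation that remains with the historian.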

Nominal record linkages typically in- 
volve analysis of a large and diverse set of 
records. Once the records of an individual 
have been linked, then a similar process 
must be performed to link the records of 
families and, ultimately, of communities. 
Historians are finding, however, that AI can 
be used to perform some interpretations as- 
sociated with the task. For example, in 



France, historians at the Institut de Re- 
cherche et d'Histoire des Textes are using 
expert systems technology to identify in- 
dividuals unambiguously, based on thir- 
teenth and fourteenth century parish 
registers. 100 Similarly, the Cambridge Group 
for the History of Population and Social 
Structures has been using artificial intelli- 
gence both for nominal record linkage and 
to disambiguate household relationships. In 
the Cambridge project, AI performs some 
of the rudimentary aspects of analysis but 
still leaves the hard questions of interpre- 
tation to the historians. Kevin Schurer, a 
member of the group, describes it this way: 

The study of history should be 
driven by theory rather than fact. AI 
techniques may help historians to ex- 
amine the relationship between facts 
more closely, and may add to the un- 
derstanding upon which interpreta- 
tions are made, yet they can never 
act as a substitute. In the examples 
given, expert systems may help us to 
determine the degree of household 
complexity in the past, or the levels 
of fertility. They may "positively" 
identify that females married on av- 
erage at age 24 and had a completed 
family size of between five and six 
at the beginning of the 19th century, 
compared to an average age of 26 and 
a completed family size of around 
three at the end of the century, yet it 
is the task of the historian to theorize 
why this transition occurred. 101 



99 See Kevin Schurer, "Artificial Intelligence and 
the Historian, Prospects and Possibilities," in Inter- 
pretation in the Humanities: Perspectives from Arti- 
ficial Intelligence, Library and Information Research 
Report no. 71, edited by Richard Ennals and Jean- 
Claude Gardin, 169-95 (Cambridge: Cambridge Uni- 
versity Press, 1990); and Joaquim Carvalho, "Expert 
Systems and Community Reconstruction Studies," 
History and Computing II, edited by Peter Denley et 
al. (Manchester: Manchester University Press, 1989), 
97-102. 



Although the primary use of AI among 
historians has been to reconstruct kinship 



100 Caroline Bourlet and Jean-Luc Minel, "A Dec- 
larative System for Setting Up a Prosopographical Da- 
tabase," in History and Computing, edited by Peter 
Denley and Deian Hopkin (Manchester, England: 
Manchester University Press, 1987), 190. 

101 Schurer, "Artificial Intelligence and the Histo- 
rian," 190. 






and community relationships, other uses also 
are being explored. For instance, French 
social historian Beatrice Henin developed 
a computer file of leasehold documents cre- 
ated by notaries and property inventories 
taken at the time of death to study seven- 
teenth-century Marseilles. Toward the end 
of her research, Henin became interested 
in the interior decor of houses from differ- 
ent social classes. Her use of artificial in- 
telligence to analyze textual descriptions of 
pictures on the walls of rooms, largely with 
religious themes, led her to develop a new 
model for understanding Protestant and 
Catholic families in seventeenth-century 
England. 102 

Another European effort, the RESEDA 
Project, uses AI to respond to historical 
questions from a biographical database of 
French public and private figures during the 
fourteenth and fifteenth centuries. In ad- 
dition to biographical information, the da- 
tabase contains abstract data about 
individuals, such as their beliefs, inten- 
tions, opinions, and mental attitudes. Using 
a hypotheses template, the system sorts the 
information to discover relevant facts, and 
infers information from the data to answer 
questions involving conjecture. 103 

In yet another type of project a scholar 
is using AI to extend the findings of Tzve- 
tan Todorov's The Conquest of America: 
The Question of the Other (1985). Todo- 
rov's work concerns the use of "signs and 
communication (and failed communica- 
tions) within the cultural encounter." 104 Jim 



102 Richard Ennals, Artificial Intelligence: Applica- 
tions to Logical Reasoning and Historical Research 
(Chichester, England: Ellis Horwood Limited, 1985), 
125. 

103 Gian Piero Zarri, "Artificial Intelligence and In- 
formation Retrieval: A Look at the RESEDA Proj- 
ect," in The Analysis of Meaning: Informatics 5, edited 
by Maxine MacCafferty and Kathleen Gray (London: 
Queens College Oxford, 1979), 166-72. 

104 Jim Doran, "A Distributed Artificial Intelligence 
Reading of Todorov's The Conquest of America: The 
Question of the Other, by Tzvetan Todorov, 1985," 
in Ennals and Gardin, Interpretation in the Humani- 
ties, 166. 



Doran at the University of Essex uses AI 
to add another dimension to Todorov's 
analysis by analyzing the belief systems and 
their impact on the behavior of the key per- 
sons and cultural groups examined in To- 
dorov's book. Using evidence for beliefs 
already embedded in Todorov's work, Doran 
furthers the analysis by systematically ex- 
amining the relationship between the be- 
liefs and the conquest of America. This effort 
suggests one way in which scholars are ex- 
ploring the use of artificial intelligence to 
extend an existing analysis of source ma- 
terials. It uses AI to examine the relation- 
ship between reasoning and beliefs, to 
categorize "faulty" belief systems, and to 
consider metabeliefs (beliefs about be- 
liefs). 105 

In the field of history, the principal in- 
vestigators using AI in their research tend 
to be credentialed as historians, not as com- 
puter scientists. There is, however, an in- 
teresting exception. Kenneth L. Jones is an 
avid avocational genealogist who works with 
the Cartographic Applications Group at the 
Jet Propulsion Laboratory (JPL) in Pasa- 
dena, California. As a hobby, Jones began 
using his AI background to unravel his 
family genealogy. The system he devel- 
oped was fairly comparable to those al- 
ready described: It provides records linkages 
by disambiguating individuals, families, and 
geopolitical boundaries. But the applica- 
tion's level of sophistication caught the at- 
tention of the American intelligence 
community. In developing the system, Jones 
produced a form of knowledge represen- 
tation (the depiction of knowledge as sym- 
bols in a form that a computer can 
manipulate), which he refers to as "Knowl- 
edge visualization." Knowledge visuali- 
zation entails the use of graphics to clarify 
or make more intelligible the relationships 
among interrelated fragments of knowl- 
edge. Conferring with colleagues at the JPL, 
Jones realized that the intricate matrices he 
was developing to assist in family research 
could be applied to any problem that in- 
volves the conceptualization of complex in- 
terrelationships among objects, such as 
tracking money-laundering or counter-ter- 
rorism activities. Jones's work on this sys- 
tem continues, with funding from the U.S. 
Army's Joint Tactical Fusion Office. 106 

Besides research-oriented applications, 
serious efforts are under way to use soft- 
ware engineering to develop a scholarly 
workstation devoted to the needs of histo- 
rians. Manfred Thaller is a historian and 
key participant in the Historical Worksta- 
tion Project sponsored by the Max-Planck- 
Institut für Geschichte in Göttingen, Ger- 
many, an institute dedicated to fundamen- 
tal research in the humanities. 107 Since 1978, 
the institute's research has been designed 
to improve software for historians. The 
workstation project focuses on the devel- 
opment of three components: software that 
can access information from both current 
and historical sources, databases that are as 
available and easy to use as books, and 
knowledge bases that allow the other com- 
ponents to draw upon information in his- 
torical reference works. The developers plan 
to use artificial intelligence to provide 
transparent interaction between subsys- 
tems, to create new rules in the knowledge 
bases when new facts are inferred, and to 
guide users to relevant information. Var- 
ious elements of a production prototype of 
the workstation are being tested. Some are 
still under development, and some of the 
more difficult aspects of context-sensitive 
interpretation are still in the design phase. 
Among sociologists, Edward Brent re- 



106 From a presentation made by Kenneth L. Jones 
at the Eighth Annual Intelligence Community AI/Ad- 
vanced Computing Symposium, Greenbelt, Mary- 
land, 12 March 1991. 

107 See Manfred Thaller, "The Historical Worksta- 
tion Project," unpublished paper delivered at the Sev- 
enteenth International Congress of Historical Sciences, 
Madrid, 29 August 1990. 



fers to the current era as "the first hint of 
what it might be like to have computers that 
act less like clerks and more like col- 
leagues." 108 His remarks pertain to the early 
benefits sociologists report in using AI for 
theory development, especially to differ- 
entiate dependent variables from indepen- 
dent variables, to develop theories based on 
causal models, and to extend sociological 
theory by transforming theoretical asser- 
tions into logical ones. Sociologists are de- 
veloping applications using artificial 
intelligence for these purposes. 109 In the field 
of literature, scholars are using natural lan- 
guage understanding for the rapid disam- 
biguation of words stored in machine- 
readable dictionaries and, within limited 
domains, to comprehend the "meaning" of 
a story. In other literary uses, expert sys- 
tems have been developed that log and ana- 
lyze differing interpretations of text among 
readers. 110 

Scholars are beginning to use artificial 
intelligence as a tool to assist in the inter- 
pretation and analysis of sources in nearly 
every corner of the social sciences and the 
humanities. 111 In addition to those men- 
tioned, researchers in the fields of archae- 
ology, linguistics, music, art history, and 
design are exploring the value of "intelli- 



108 Edward Brent, "Is There a Role for Artificial 
Intelligence in Sociological Theorizing?" American 
Sociologist 19 (Summer 1988): 164. 

109 Ibid., 160-64. 

110 See, for instance, Nancy M. Ide and Jean Ve- 
ronis, "Very Large Neural Networks for Word Sense 
Disambiguation," paper presented at European Con- 
ference on Artificial Intelligence, Stockholm, August 
1990; Nancy M. Ide and Jean Veronis, "Artificial 
Intelligence and the Study of Literary Narrative," Po- 
etics 19 (1990): 37-63; and David Miall, "An Expert 
System Approach to the Interpretation of Literary 
Structure," in Ennals and Gardin, Interpretation in 
the Humanities, 196-214. 

111 The Foundation for Intelligent Systems in the 
Social Sciences, Arts and Humanities is a new organ- 
ization that publishes a quarterly newsletter, Intelli- 
gent Systems, devoted to applications in these 
disciplines. For further information, contact the foun- 
dation's director, Stephen Toney, at 2205 Gabriel 
Drive, Las Vegas, Nevada 89119. 

gent" tools, such as expert systems shells 
and specialized software, capable of per- 
forming functions attractive to a variety of 
disciplines. 112 During this decade, as pri- 
mary sources become more available in 
machine-readable form and as commercial 
AI software becomes more sophisticated and 
prevalent, it is likely that scholars will turn 
increasingly to AI for research assistance. 

Dissemination of Research Findings 

The scholarly obligation to report re- 
search findings is typically fulfilled through 
the publication of articles in peer-reviewed 
print journals or monographs. Until re- 
cently, the defining feature of a publication 
was its linear and printed format. But the 
emergence of electronic publishing and hy- 
permedia is challenging this definition of 
a document. The scholarly use of electronic 
publishing and hypermedia is a result of the 
dual trends toward end-user computing and 
greater connectivity. Considered together, 
these new dissemination and presentation 
formats are beginning to transform the 
manner in which findings are shared in the 
scholarly community. 

Electronic publishing. Introduced less 
than a decade ago, electronic publishing al- 
ready represents a $6.5 billion business ac- 



cording to current estimates. 113 The most 
viable commercial electronic publishing ef- 
forts involve indexing and abstracting texts 
and electronic versions of full-text print 
journals. Through electronic publishing, it 
is increasingly possible for researchers to 
access on their computers full-text versions 
of "newspapers and newswires, popular 
magazines and scholarly journals, financial 
and directory sources, and reference 
books." 114 For example, electronic ver- 
sions of more than forty medical journals 
are available in full text, as are some of the 
most important scientific and technical 
journals. 115 More than three hundred full- 
text newsletters can be accessed through 
either NewsNet or Dialog files. Business 
and industry periodicals enjoy wide cov- 
erage in electronic form, as do specialized 
titles like marketing reports. 116 Unlike bib- 
liographic databases developed primarily for 
use by information specialists, full-text da- 
tabases generally are designed for the end- 
user. Researchers, enthusiastic about the 
convenience of these databases, also find 
electronic publishing attractive because it 
promises to increase the pace of publication 
and expand opportunities for dialogue among 
scholars. 

An electronic resource directory created 
by Bibliofile, Fulltext Sources Online, 
identifies more than fifteen hundred full- 
text and information sources available on- 



112 For further information on shells, see Avra 
Michelson, Expert Systems Technology and Its Impli- 
cations for Archives, National Archives Technical In- 
formation Paper no. 9 (Washington, D.C.: National 
Archives and Records Administration, March 1991), 
9-10. An example of specialized software is Ex-Sam- 
ple, which helps researchers determine an appropriate 
sample size for a study. Ex-Sample is reviewed in 
Edwin H. Carpenter and Rick D. Axelson, "Statis- 
tical and Graphical Research Methods: State of the 
Art," in Social Science Computer Review 7 (Winter 
1989), 508. Another example, IXL's Discovery Ma- 
chine, performs pattern-matching over large amounts 
of data that typically would go undetected through 
manual analysis. For a report on its use, see Karen 
D. Schwartz, "Agencies Use Software to Dig Up Links 
Among Data," Government Computer News 19 (15 
October 1990): 60. 



113 Council on Library Resources, Communications 
in Support of Science and Engineering, Report to the 
National Science Foundation. Washington, D.C.: 
Council on Library Resources, August 1990, 11-8. 

114 See Ruth A. Pagell, "Primary FTDBs for the 
End User: New Roles for the Information Profes- 
sional," Online Review 13 (April 1989): 143. 

115 Ibid., 146. The Hunt Library at Carnegie Mellon 
University is compiling an electronic full-text corpus 
of extended runs of computer science journals on ar- 
tificial intelligence. The specific journals and runs are 
cited in a subsequent section of this paper (see "The 
Library Profession's Response to New Forms of 
Scholarship/Software Engineering" section). 

116 Pagell, "Primary FTDBs for the End User," 143- 
46. 

line. 117 The trend watchers in the industry 
estimate that by the year 2000 much of 
scholarly and professional publishing will 
occur electronically, involving the trans- 
mittal of journals and books over high-speed 
networks by authors to the publishers, and 
then from publishers to readers. 118 

Further, publishers are discovering that 
the electronic versions of certain printed 
products are beginning to turn a profit. In- 
deed, Harry Boyle of Chemical Abstracts 
Service (CAS), one of the world's largest 
indexing and abstracting companies, de- 
scribes the shift occurring in his company 
in this way: 

The revenue base for the printed 
product is shrinking. The revenue base 
for the electronic product is growing. 
Fifteen years ago the printed product 
was paying the bills. In the next five 
years, the electronic form of the 
product will be the dominant way that 
the database is used and the printed 
will become secondary. We are rap- 
idly approaching the point where the 
electronic use of the product is in fact 
generating a lot of the revenue needed 
to build the database, and the printed 
product is becoming the secondary 
concern. I don't think we will stop 
the printed product. But if you look 
at the economies inside the company, 
you'll know that electronic use is 
paying the bills and it is subsidizing 
the printed product which is an exact 
reverse of what we saw fifteen years 
ago. 119



117 Richard Van Orden, "Content-Enriched Access
to Electronic Information: Summaries of Selected
Research," Library Hi Tech 31 (1990): 28.

118 Robert Weber, "The Clouded Future of Electronic
Publishing," Publishers Weekly 237 (29 June
1990): 76.

119 Jeffrey K. Pemberton, "Online Interviews Harry
Boyle on CAS's New License Policy . . . Effects on
Searching Prices," Online 12 (March 1988): 21.



On the surface, electronic publishing 
seems to imply only a change in the form 
of distributing publications. But scholars in 
the social sciences and the humanities have 
begun to use the existing research and ed- 
ucation networks to engineer a new form 
of publication distinct from commercial ef- 
forts. These publications are academic- 
based, scholarly created and controlled, 
(often) refereed, electronic-only, network- 
delivered journals. Although scholarly 
electronic journals were invented only sev- 
eral years ago, already about three dozen 
have sprung up in an array of disciplines, 
along with sixty newsletters and the thou- 
sands of electronic conferences used for less 
formal communications. 120 

PSYCOLOQUY is one of the best ex- 
amples of the innovative genre of elec- 
tronic journals. 121 The journal's editor, 
Stevan Harnad, a cognitive psychologist at 
Princeton University, has edited an influ- 
ential nonelectronic journal (Behavioral and 
Brain Sciences) for more than fifteen years. 
Harnad decided to edit a scholarly electronic
journal as a result of his experience
participating in an early electronic confer- 
ence. He characterized early users of net- 
works as primarily computer enthusiasts and 
graduate students. These two audiences 
possessed enough time and motivation to 
venture into the new medium of conferenc- 
ing, a unique form of communication that 
allows people, dispersed in time and place, 
to share ideas, ask questions, comment on 
work, and sustain narrative discussions. As 



120 Michael Strangelove, Directory of Electronic
Journals and Newsletters, ed. 1, July 1991. (To retrieve
electronically, contact the author at
<441495@uottawa>; the directory is also available
in print through the Association of Research Libraries,
Washington, D.C.). Ann Okerson, of the Association
of Research Libraries, provided updated information
on current journal numbers to Avra Michelson in March
1992.

121 The PSYCOLOQUY discussion is from notes on
a presentation by Stevan Harnad at the "Refereed
Journals" session on 21 March 1991, at the National
Net '91 Conference in Washington, D.C.

an early participant in an AI conference,
Harnad decided to transmit work in a form 
more polished than customary, as if he were 
writing for a peer-reviewed journal. To his 
great surprise, he found the exchange tre- 
mendously helpful to his intellectual work. 
Instead of waiting several years to receive 
peer responses, he received instantaneous 
reactions to his work over the networks. 
Further, the responses arrived at the begin- 
ning of his intellectual process rather than 
at the end, as happens with conventional 
publishing. Inspired by his conference ex- 
perience, Harnad wondered what it would 
be like to experience with the best minds 
in his field the same kind of instantaneous 
dialogue he had established with computer 
enthusiasts and graduate students. This 
prompted him to create PSYCOLOQUY, a 
fully refereed, scholarly, electronic-only 
journal, sponsored by the American Psy- 
chological Association. 

PSYCOLOQUY is an interdisciplinary 
journal that publishes articles and reviews 
concerning psychology, neuroscience, cog- 
nitive science, behavioral biology, linguis- 
tics, and philosophy. Its editorial board of 
fifty scholars reflects the range of disci- 
plines published by the journal. Journal 
submissions, refereeing, editorial work, and 
distribution are handled entirely electroni- 
cally. There are currently more than two 
thousand individual subscribers on Bitnet. 
A large number of institutional subscribers 
also receive PSYCOLOQUY through Use- 
net, a network connected to most of the 
universities and research institutions of the
world, allowing all individuals at these sites 
to access the journal. In 1990, Library 
Journal named PSYCOLOQUY one of the 
year's best journals. 

Harnad contends that the most important 
difference between electronic journals and 
print publication is not the form of distri- 
bution but the medium's potentially revo- 
lutionary contribution to the furthering of 
scholarship and the creation of knowledge. 
The real contribution of the electronic
medium is that it does what no other medium
can do. Instead of waiting a year or two 
for peer feedback (the typical amount of 
time it takes to publish and then respond in 
print), and instead of receiving the feed- 
back when already strongly invested in the 
next research project, scholars enjoy rig- 
orous intellectual dialogue with one an- 
other, freed from the constraints of time 
and place, at the front end of the research 
process. 122 The instantaneous distribution 
of ideas among peers permits a new and 
critically important type of interaction that 
furthers scholarly inquiry in a way not pos- 
sible previously. The electronic medium is 
unique in its capacity to support interactive 
improvement of scholarship at a speed much 
more commensurate with the speed of 
thought. 

Other examples of scholarly, electronic- 
only journals include Post Modern Culture 
(North Carolina State University), an in- 
terdisciplinary journal of literary theory, 
culture, and creative writing; Artcom, de- 
voted to the interface of art and commu- 
nication technology; Quanta (Carnegie 
Mellon University), an electronic journal 
of science fiction and fantasy; the Bryn 
Mawr Classical Review, a review journal 
of books on Greek and Latin classics; On- 
line Journal of Distance Education and 
Communication (University of Alaska), de- 
voted to the development and practice of 
distance education; EJournal (State University
of New York at Albany), an interdisciplinary
journal on the theory and
practice of electronic communication; New



122 Stevan Harnad, "Scholarly Skywriting and the
Prepublication Continuum of Scientific Inquiry,"
Psychological Science 1 (November 1990): 342. A
similar point is made by Cliff McKnight in his article,
"Using the Electronic Journal," in Scholarly
Communication and Serials Prices: Proceedings of a
Conference Sponsored by the Standing Conference of
National and University Libraries and the British
Library Research and Development Department 11-13
June 1990, edited by Karen Brookfield (New York:

Horizons in Adult Education (Syracuse
University Kellogg Project), a refereed 
journal for the field; Journal of the Inter- 
national Academy of Hospitality (Virginia 
Polytechnic Institute and Blacksburg State 
University), publishing refereed articles on 
basic and applied research on hospitality 
and tourism; and the Public-Access Com- 
puter Systems Review, exploring electronic 
access to library materials. 123 At a meeting 
recently convened by the Association of 
Research Libraries (ARL), electronic jour- 
nal editors established the Association of 
Scholarly Journal Editors, a "closed" elec- 
tronic communications list for discussing 
common concerns and new publishing ef- 
forts. 124 

Aside from scholarly controlled elec- 
tronic journals, commercial publishers are 
beginning to explore the profitability of 
publishing electronic-only academic jour- 
nals. The American Association for the 
Advancement of Science (AAAS), in con- 
junction with the Online Computer Library 
Center, Inc. (OCLC), announced the pub- 
lication of its first electronic-only journal 
in 1992. The publishers expect The Online 
Journal of Current Clinical Trials, a new
peer-reviewed medical journal, to distribute
the findings of original research several
months faster than its print counterparts. 
The journal represents the first commercial 
electronic effort to display typeset-quality 
graphs, tables, and equations. The editors 
are Edward J. Huth (former chief editor for 
nineteen years of the Annals of Internal



123 Information on electronic journals from Strangelove,
Directory of Electronic Journals and Newsletters;
for a discussion on publishing a scholarly
electronic journal see Charles W. Bailey, Jr.,
"Electronic (Online) Publishing in Action . . . The
Public-Access Computer Systems Review and Other
Electronic Serials," Online 15 (January 1991): 28-35.

124 Informal presentation made by Ann Okerson
(Association of Research Libraries) to a session on
"Non-Commercial Publishing" at the Spring meeting
of the Coalition for Networked Information, 19 March
1991, Washington, D.C.; also, letter from Ann Okerson
to Avra Michelson dated 8 July 1991.



Medicine), Curtis Meinert (of the Johns 
Hopkins Center for Clinical Trials), and 
Thomas C. Chalmers (associate director of 
the Technology Assessment Group, Har- 
vard School of Public Health). 125 If the 
commercial publication of scientific jour- 
nals proves successful, it is likely that their 
counterparts will emerge in the social sci- 
ences and humanities. 

Hypermedia. During the last ten years,
hypermedia has developed into a mature 
tool that supports electronic browsing by 
allowing users to follow links through text, 
images, and audio and visual records. The 
electronic links that characterize the tech- 
nology also make it possible to compose 
and deliver research products in new 
ways. 126 As hypermedia becomes a mainstream
technology during this decade,
scholars are encountering the prospect of 
redefining the modern product of research. 
Should a hypermedia document provide au- 
tomatic links that take a reader from a foot- 
note to the actual cited work? Should 
hypermedia documents chronicle through 
links the intellectual process of discovery? 
What new types of authoring guidelines are
necessary for research products developed
in hypermedia? Scholars are beginning to
tackle some of the hard questions raised by 
the availability of a technology that allows 
for a more complex organization of ideas. 
The scholarly creation and consumption of 
hypermedia documents is another example 
of the trend toward end-user computing, 
further stimulated in this case by the online 
transition. 

One historian argues that the power of 
hypertext (hypermedia restricted to text) is 
that "it produces documents not intended 



125 The Online Journal of Current Clinical Trials,
brochure published by the American Association for
the Advancement of Science and OCLC, ca. 1991.

126 For an introduction to hypermedia, see Jeff
Conklin, "Hypertext: An Introduction and Survey,"
in Computer-Supported Cooperative Work: A Book of
Readings, edited by Irene Greif (San Mateo, Calif.:
Morgan Kaufmann Publishers, 1988), 423-75.






to exist in printed form." 127 He describes
the contrast between a standard history
textbook and a hypertext product through
this example: 

Imagine a computerized book of 
documents. As you open it to the 
Monroe Doctrine, you see the several 
paragraphs of the President's address 
which make up the statement of for- 
eign policy. Gliding a mouse-di- 
rected cursor over the words, an icon 
pops up next to the words, "Russian 
Imperial Government." By clicking 
the mouse, you reveal a brief essay 
on the Russian Czar's interest in 
Alaska. The word "Czar" in that 
subtext can bring up the Czar's actual 
statements on the subject, and 
"Alaska" can trigger a map of the 
Pacific Northwest. After folding these
asides back into the original document,
you reach the phrase, "With
the existing colonies or dependencies
of any European power we have not
interfered and shall not interfere," and
clicking the mouse reveals an annotated
list of interventions prior to 1823.
That screen will activate a map of 
Central and South America showing 
the new revolutionary governments 
and the dates of their independence 
from Spain. 128 

Scholars already have begun to produce 
research projects in nonlinear formats. At 
Stanford University, for example, a hyper- 
media Shakespeare application created by 
Larry Friedlander allows users to view on 
a video monitor filmed versions of Shak- 
espearean plays, while viewing on another 
screen a synchronized presentation of the
play's text and stage blocking material. At



127 James B. M. Schick, Teaching History with a
Computer: A Complete Guide (Chicago: Lyceum
Books, 1990), 63.



any point, users can refer to dictionaries 
and historical notes to increase their un- 
derstanding of the performance. The sys- 
tem also allows users to create animated 
versions of plays, provides interactive tu- 
torial instruction on theater topics, and sup- 
ports note taking. 129 

Another hypermedia application, de- 
signed for use in an undergraduate poetry 
course, uses software to convey the ideas 
that poems are related to other poems, that 
they may be related to other art forms, and 
that they may be related to both other poems 
and other forms of art simultaneously. 130 
Since poems often refer to lines from other 
poems, use a painting to develop an anal- 
ogy, quote a piece of literature, or allude 
to a music score, a hypermedia document 
can make poetry truly come alive by using 
links to demonstrate concretely the cultural 
attachments among art forms. 

Two other projects are representative of 
efforts to use hypermedia as a new author- 
ing medium. The Faculty of Art and De- 
sign at Coventry Polytechnic in England 
considered the possibilities of using hyper- 
media as an authoring medium for four 
years. As a result of their deliberations, the 
faculty decided to allow students to submit 
the curriculum's required thesis in hyper- 
text. The thesis is a research product on the 
historical and theoretical portions of the 
curriculum. Hypermedia enables the art and 
design students to incorporate their design 
and visualization aptitudes into the organ- 
ization and presentation of a theoretical 
work. After this experiment with student 
theses, the faculty will evaluate the effec- 
tiveness of hypertext as an authoring me- 



129 Charles W. Bailey, Jr., "Intelligent Multimedia
Computer Systems: Emerging Information Resources
in the Network Environment," Library Hi Tech 8
(1990): 31.

130 John M. Slatin, "Text and Hypertext: Reflections
on the Role of the Computer in Teaching Modern
American Poetry," in Humanities and the
Computer: New Directions, edited by David S. Miall
(Oxford: Clarendon Press, 1990), 129-31.









dium and will consider the most appropriate 
contexts for its use. 131

Finally, in anticipation of the widespread 
use of hypermedia as an authoring tool, the 
British Library Research and Development 
Department is funding Project Quartet, a 
research effort to develop a standard set of 
guidelines for creating hypermedia docu- 
ments. The principals on the project argue 
that researchers authoring in hypertext need 
guidelines for establishing nodes and links 
to provide the necessary hooks for readers. 
They contend that the skills used for writ- 
ing in paper media do not adequately serve 
the needs of scholars authoring in the elec- 
tronic age. The project hopes to establish 
global taxonomies for hypertext authoring 
that can be used across systems. 132 

Curriculum Development and 
Instruction 

The enormous amount of literature on 
computer-aided instruction makes it appear 
that faculty in the social sciences and hu- 
manities use computer technology to im- 
prove their teaching to an even greater extent 
than for research. This is not surprising, 



131 Alan Dyer and Kate Milner, "An Examination
of Hypertext as an Authoring Tool in Art and Design
Education," in Humanities and the Computer: New
Directions, 137-48.

132 Cliff McKnight, John Richardson, and Andrew
Dillon, "Hypertext Authoring: Some Basic Issues,"
Humanities Communication Newsletter 11 (1989): 25-29.
See also other publications issued by this group,
such as Hypertext in Context, The Cambridge Series
on Electronic Publishing (Cambridge: Cambridge
University Press, 1991); "Human Factors of Journal
Usage and Design of Electronic Texts," Interacting
with Computers 1 (1989): 183-89; "The Effects of
Display Size and Text Splitting on Reading Lengthy
Text From Screen," Behaviour & Information Technology
9, no. 3 (1990): 215-27; and Bill Tuck, Cliff
McKnight, Marie Hayet and David Archer, Project
Quartet, Library and Information Research Report no.
76 (Wetherby, England: British Library, 1990). Cornell
University is also experimenting with the usability
of an online hypermedia presentation of thousands
of articles published in the Journal of the American
Chemical Society. See Michael Alexander, "But Can
You Read It Like A Book?" Computerworld 24 (19
November 1990): 18.



since the fundamental aspect of the tech- 
nological revolution is that faster, smarter 
machines affect the ways we think and learn. 
According to Mary Alice White, director
of the Electronic Learning Laboratory at 
Columbia University's Teachers College, 
information technologies change "how we 
represent information, and therefore how 
we view a problem . . . how we analyze 
problems, and because they change that view 
and that analysis, they can change how we 
make decisions. These are intellectual tools, 
the very stuff and excitement of educa- 
tion." 133 The scholarly use of computers 
to develop instructional applications is an- 
other example of the trend toward end-user 
computing, while connectivity represents 
the key trend that allows for new styles of 
distance education. 

Teachers at every educational level are 
revising curriculums to include computer- 
supported instruction, such as simulations, 
cognitive modeling, and individual-ori- 
ented learning. The trend for an increasing 
portion of academia is toward "computer 
campuses" where students are required to 
purchase a specific set of computer equip- 
ment upon enrolling. Some universities have 
begun to fund positions devoted exclu- 
sively to helping faculty develop instruc- 
tional software or incorporate information 
technology into the classroom. 134



133 Mary Alice White, "The Third Learning Revolution,"
Electronic Learning 7 (January 1988): 6.

134 See Schick, Teaching History with a Computer:
A Complete Guide, 207-08. Schick cites Drexel, North

Computer simulations are especially popular in
many disciplines because they submerge 
students in a different social context, allowing
them to consider "what-if" scenarios.
The simulations tend to be particularly
effective in promoting an understanding of 
history because the first-person experience 
of time and circumstance helps students appreciate
that the past is shaped by individuals
reacting to social events and forces. 135
An extensive array of simulation soft- 
ware is available for instructional purposes, 
including hundreds of applications in the 
field of history alone. For example, the ex- 
perience of the Constitutional Convention, 
complete with delegate selection and the 
ratification process, is available to students 
using simulation software. Other simula- 
tions allow students to experience U.S. 
congressional committee debates on read- 
mitting southern states to the union after 
the Civil War, or to participate in the pres- 
idential decision on whether to take action 
in the Pullman Strike of 1894. A National 
Geographic Society product simulates the 
construction of the Transcontinental Rail- 
road, challenging students to decide about 
such issues as construction, labor, and re- 
lations with Native Americans. Another 
simulation focuses on the military tactics 
used by the Soviet Union against Nazi forces.
At Stanford University, a French History 
professor developed a simulation that establishes
a seventeenth-century bourgeois
context for students to negotiate a strategic 
marriage, consider proper investments, and 
manage the family's inheritance to promote 
their stature. Some simulations employ ar- 
tificial intelligence techniques to demon- 
strate more fully the meaning of historical 
context. For instance, AI-enhanced simulations
are available for such events as the
Russian Revolution and the development of
the European Economic Community. 136

135 Schick, Teaching History with a Computer, 101-
136 For an extensive critical bibliography of current



Although simulations are one of the most 
popular forms of computer technology found 
in the classroom, other types of applica- 
tions are also in use. In an effort to com- 
puterize a full discipline's curriculum, 
Gregory Crane, of Harvard University's 
Classics Department, established the Per- 
seus Project. The Perseus database at- 
tempts to provide an interactive multimedia 
curriculum on classical Greek civilization. 
It contains a vast corpus of the discipline's 
sources, including translations of major 
Greek texts, introductory materials de- 
signed for novice students, Greek language 
texts for more advanced students, color im- 
ages and line drawings of archaeological 
artifacts and maps, essays and themes on 
key facets of Greek literature, a chronol- 
ogy, and a classical encyclopedia. The hy- 
permedia application provides course 
materials for such disciplines as art, ar- 
chaeology, classics, history, law, philoso- 
phy, and political science. In early 1992, 
Yale University Press released version 1.0 
of the Perseus database, which runs on 
Macintosh computers with the HyperCard 
program. 137 

Anticipating the availability of large vol- 
umes of humanities source materials on- 
line, the Stevens Institute of Technology, 
with support from the Humanities Grant 
Program of the New Jersey Department of 
Higher Education, is exploring how access 
to electronic source materials is apt to restructure
humanities education. In particular,
they are interested in learning how
electronic texts, such as those compiled for 



simulations available for the study of history, see
Schick, Teaching History with a Computer, 122-45,
and for a description of the Stanford University
simulation see pages 100-01. The Social Science
Computer Review regularly reviews commercial simulation
software designed for educational purposes. The AI
simulations are mentioned in Ennals, Artificial
Intelligence, 125.

137 See a brief review of the project in Social Science
Computer Review 7 (Summer 1989): 211; also,
Loughridge, "Information Technology," 281.

specific disciplines, can best be integrated
into undergraduate course work. As part of
their research effort, they plan to evaluate 
student learning patterns and actual student 
performance with electronic curricu- 
lums. 138 

In a different approach, a group of fac- 
ulty at Manchester Polytechnic in England 
is developing "viewbooks" for use in his- 
tory curriculums. These disk-based books 
take the form either of annotated historical 
documents with introductions and conclusions
or of texts and tables. Approximately
twenty-five different books are available, 
and they permit information to be retrieved 
through various techniques from the data- 
bases. The developer is currently designing 
a viewbook shell that will allow instructors 
to insert the text of their choice into the 
database. This type of application will ben- 
efit from advances in document-conversion 
scanning technology. 139 

Commercial software specially designed 
for particular disciplines is becoming avail- 
able, which will facilitate the use of elec- 
tronic source materials in classrooms. One 
recently released package, for example, 
displays nineteenth-century statistical cross- 
tabulations and regressions on France, En- 
gland, and Wales. 140 Faculty also are mak- 
ing use of AI shells; a history instructor 
found a particular software shell enhanced 
with artificial intelligence well-suited for 
an application on the Norman Invasion. This 
same instructor also chose an expert sys- 
tems shell to construct a learning tool for 



138 See Edward A. Friedman, James E. McClellan
III, and Arthur Shapiro, "Introducing Undergraduate
Students to Automated Text Retrieval in Humanities
Courses," in Humanities and the Computer, 103-12.

139 See Richard H. Trainor, "History, Computing
and Higher Education," in History and Computing II,
38-39; for a discussion of scanning technologies, see
Timothy C. Weiskel, "University Libraries, Integrated
Scholarly Information Systems (ISIS), and the
Changing Character of Academic Research," Library
Hi Tech 6 (1988): 15.

140 See review in Social Science Computer Review
7 (Summer 1989): 211.



the study of the Middle Ages. 141 The use 
of information technology by the British in 
history curriculums is so great that the 
country's academics have established a for- 
mal organization to support the exchange 
of technical resources. Headquartered at the 
University of Bath, the National Informa- 
tion for Software and Services organization 
coordinates the sharing of historically ori- 
ented software and data files among college 
professors. 142 

Apart from computer-assisted curricu- 
lums, teachers in the United States are using 
information technology to support a new 
style of education. Distance learning, in es- 
sence an improved successor to correspon- 
dence course work, interactively links 
teachers and students in scattered locations. 
During the past few years, a majority of 
states have become active proponents of 
distance learning. The findings of a recent 
survey show that thirty-two states "cur- 
rently have at least one statewide network 
for distance learning, and nearly half have 
more than one." 143 Enthusiasm for dis- 
tance learning seems to emanate as much 
from advances in storage and retrieval technology
as from telecommunications networks
that expand the ability to use
information at distant locations. Indeed, a 
study conducted by the U.S. Office of 
Technology Assessment found that dis- 
tance learning no longer serves only iso- 
lated rural schools. Rather, it has become 
the vehicle for bringing advanced, special- 
ized course work and an array of experts 
to many classrooms. Existing programs 
make it possible for a high-school student 
in Mississippi to study Japanese, and for 
Washington State to provide advanced 



141 See Martyn Wild, "History and New Technology
in Schools: Problems, Possibilities and the Way
Forward," in History and Computing II, 30.

142 See Trainor, "History, Computing and Higher
Education," in History and Computing II, 40.

143 Barbara Kurshan and Marcia Harrington, Statewide
Education Networks: Survey Results (Roanoke,
Va.: Educorp Consultants, April 1991), 2.

placement English courses to all who qualify.
In Maine, teachers enrolled in a master's
program attend after-hours graduate
courses in their classroom via distance 
learning, instead of undertaking a four- to 
five-hour commute. 144 The feasibility of 
using distance learning to maximize uni- 
versity students' control over the time, place, 
and pace of education is being evaluated 
through experimental courses. The flexi- 
bility of a distance-learning program is apt 
to be particularly attractive to full-time em- 
ployed students enrolled in advanced de- 
gree programs. 145 

The infusion of technology into educa- 
tional programs is occurring rapidly. Ex- 
amining the effectiveness of technology as 
an educational tool represents a popular area 
of research, though findings are still some- 
what preliminary. One study on the impact 
of the use of AI tutors in high-school ge- 
ometry classes found that the individually 
paced applications fostered a healthy com- 
petitiveness among students. 146 In a tradi- 
tional classroom, the students never had the 
opportunity either to get ahead of or fall 
behind one another. With the AI tutor, 
however, self-paced learning stimulated 
students to rival one another, as they would 
call out in class the "page" on the monitor 
they had advanced to through correct an- 
swers. 

The study also observed that the majority 
of students enjoyed the AI tutoring more 
than conventional classroom instruction and 
that the enjoyment translated into increased 



'"Sec U.S. Congress, Office of Technology As- 
sessment, Unking for Learning: A New Course for 
Education, OTA-SET-430 (Washington, D.C.: U.S. 
Government Printing Office, November 1989), 2-3, 
54. 

M5 Scc Gil Rogers, "Teaching a Psychology Course 
by Electronic Mail,*' Social Science Computer Re- 
view 7 (Spring 1989): 60-64. 

,4ft Scc Janet Ward Schoficld, Dcbra Evans-Rhodes, 
and Brad R. Hubcr, "Artificial Intelligence in the 
Classroom: The Impact of a Computer-Based Tutor 
on Teachers and Students/* Social Science Computer 
Review 8 (Spring 1990): 24-41. 



motivation. In addition, the study found that 
students appreciated the independence from 
adult control and that with the computer 
they were free to vent anger and frustration 
unacceptable with teachers. But probably 
most important, the research discovered that 
the students experienced the tutor as a game 
and thus associated it with play. The elec- 
tronic games popular among youth, com- 
bined with computer-assisted learning, in 
essence are preparing the next generation 
for a new era. As a result of changes oc- 
curring in education and play, young peo- 
ple are being thoroughly indoctrinated into 
the computer culture. The use of informa- 
tion technology and electronic communi- 
cation will be deeply ingrained in the next 
generation of researchers, who will have 
been computer veterans since elementary 
school. The current demands for electronic 
information available through networks in 
homes and offices can only escalate and 
deepen among tomorrow's scholars. 

Summary 

As the preceding section indicates, the 
clear trend in the modern research process 
is toward scholarly identification, use, 
interpretation, and analysis of sources in 
electronic form, and the gaining promi- 
nence of new forms of computer-assisted 
communication and instruction. The re- 
search process is already changing, and this 
change is accelerating and spreading across 
a wide range of disciplines. Because a key 
factor promoting this change is the availa- 
bility of new information technology, ana- 
lyzing how trends in information technology 
interact with current trends in scholarly 
practice can help predict the future evolu- 
tion of the research process. 

The analysis of information technology 
undertaken above points to two major tech- 
nology trends that are likely to transform 
scholarly practice: increased end-user computing
and increased connectivity. This
analysis also implies that a number of more
specific technologies, including artificial intelligence,
end-user publication and distribution,
hypermedia, and visualization and
virtual reality, are likely to have a signifi- 
cant impact on the research process. The 
effects of these trends, along with changes 
in scholarly practice that are already under 
way, point to a future in which researchers 
use computation and electronic communi- 
cation to help formulate ideas, access 
sources, perform research, collaborate with 
colleagues in their own and other disci- 
plines, seek peer review, publish and dis- 
seminate results, and engage in many 
professional and educational activities. Far 
from being visionary, this future is already 
present; it is currently being experienced
by significant and increasing numbers of 
researchers from many disciplines. 

How should the archival profession re- 
spond to these changes in scholarly prac- 
tice? Are the techniques and functions 
developed by the archival profession to 
manage printed media adequate for the needs 
of researchers who operate in a global elec- 
tronic networking environment? Should es- 
tablished archives convert printed material 
to machine-readable form? If so, what se- 
lection criteria should be used? What con- 
stitutes the "reference function" in the age 
of research and education networks and 
electronic communication? These issues are
addressed first through case examples drawn
from the experience of the library com- 
munity, and then by a set of recommen- 
dations specifically designed for the archival 
profession. 

RESPONSES BY THE LIBRARY 
PROFESSION TO CHANGING 
RESEARCH PRACTICES 

On several occasions in the recent past, 
libraries and professional associations have 
sponsored inquiries into scholarly use of 
technology. For example, the American 
Council of Learned Societies conducted a 
survey in 1985 to 1986 that noted the rapid 



increase in the use of technology by the 
scholarly community. 147 In a more recent 
study sponsored by the Harvard College Li- 
brary and the American Council of Learned 
Societies, the Conference on Research 
Trends and Library Resources brought so- 
cial science and humanities scholars to- 
gether to explore new trends in research 
methods. Scholars spent several days con- 
sidering the impact of new technology, in- 
terdisciplinary research, and the use of 
innovative formats of materials on their 
work. 148 In another effort, the American 
Academy of Arts and Sciences sponsored
an exchange between scholars and librar- 
ians to develop policy recommendations to 
improve access to library materials. A key 
observation shared by these inquiries is that 
scholars increasingly want online access to 
electronic source materials available through 
personal computers in their homes or of- 
fices. 

Visionary leaders within the library com- 
munity are beginning to implement pilot 
projects designed to improve the library's 
role in advancing scholarship and its re- 
sponse to changing research methods. These 
projects hold particular interest for archi- 
vists as the key distinction between the 
printed form of archival and library materials
is disappearing. Indeed, in an electronic
environment, concepts such as
"unique" and "multiple," which have been 
used to distinguish archival sources from 
library materials, are less meaningful. It is 
not surprising that librarians hold differing 
opinions regarding the most appropriate role 
for libraries in the electronic environment. 
Some librarians argue for continuity: the
continued commitment to collection development.

147 Morton and Price, The ACLS Survey of Scholars:
Final Report of Views on Publications, Computers,
and Libraries, 33.

148 Lawrence Dowler, "Conference on Research
Trends and Library Resources," 22-23 February 1990,
unpublished draft report (Cambridge, Mass.: Harvard
University, Widener Library, n.d.).

Those who hold this position argue
for consolidating library resources in
the activities of selection and collection 
management and for relinquishing a role 
for libraries in converting source materials 
to electronic form. In contrast, the propo- 
nents of change claim that the continuity 
approach could mark the end of the era of 
free access to information because com- 
mercial vendors would step in to convert 
library materials and make them available 
for a fee in electronic form. The advocates 
of information-based institutions champion 
a new vision of the library without walls— 
an enterprise comprising many electronic 
libraries (including commercially produced
products) that provide network access to 
patrons. Regardless of their perspective, both 
sides agree that patron demands for elec- 
tronic access to library materials will be 
met by someone. 149 This section examines 
several leading projects and programs un- 
dertaken by the library community to ad- 
dress changes in the research environment, 
focusing on four new trends in professional 
activity: (1) promoting high-performance 
connectivity, (2) conversion of printed ma- 
terials to machine-readable form, (3) soft- 
ware engineering for next-generation 
systems, and (4) transformations in profes- 
sional roles. 

Promoting Connectivity 

In the last few years, library leaders have 
forged a new political alliance with academic
computing centers and the telecommunications
industry to support the
development of high-performance comput- 
ing networks capable of rapidly transmit- 
ting huge amounts of data and high- 
resolution graphics. A high-performance 
computing network is needed because the 



149 For two perspectives on the topic see Stephen
E. Ostrow and Robert Zich, in Research Collections
in the Information Age: The Library of Congress Looks
to the Future, edited by John Y. Cole (Washington,
D.C.: Library of Congress, 1990).



several thousand academic, governmental, 
regional, and private networks that already 
operate worldwide cannot transmit data and 
images fast enough or in large enough 
chunks to keep pace with the needs of sci- 
entific research. Furthermore, faster net- 
works with higher bandwidths will expand 
infrastructure support for scholarly ex- 
change of visually-oriented material (such 
as that required for medical research), on- 
line electronic publishing, and high-speed 
interchanges of text and graphics in the arts 
and social sciences. 

Recognizing the need for infrastructures 
(or "highways") to disseminate materials 
electronically, the Association of Research 
Libraries (ARL) in 1990 joined with aca- 
demic and administrative computing cen- 
ters to form the Coalition for Networked 
Information (CNI). CNI is a collaboration 
among three distinct groups (EDUCOM,
CAUSE, and the ARL) who have united
to "promote the creation of and access to 
information resources in networked envi- 
ronments in order to enrich scholarship and 
enhance intellectual productivity." 150 The
most immediate focus of the coalition's work 
is to establish the National Research and 
Education Network (NREN), a federally 
supported high-performance computing 
network. In the interim, NSFNet (a net- 
work administered by the National Science 
Foundation), in conjunction with the thou- 
sands of other existing networks, serves as 
the precursor for the future operational 
NREN. 

The coalition is optimistic about imple- 
menting NREN as a gigabit-per-second 
network. In 1991, Congress passed the High 
Performance Computing Program that es- 
tablishes the mandate for NREN. Although 
the original motivation for NREN emerged 
from the scientific community's requirements,
the broader constituency represented

150 From Coalition for Networked Information, Mission
Statement, March 1990.

by CNI envisions a network devoted
to kindergarten through high school (K-12) 
programming, as well as leading-edge re- 
search. Indeed, EDUCOM recently desig- 
nated a full-time staff position for the 
development of network K-12 programs. 
CNI's commitment is to the development
of a network available to all the nation's 
teachers, students, and researchers. 151 

When fully implemented, NREN will al- 
low researchers at universities, national 
laboratories, nonprofit institutions, govern- 
ment research centers, and private industry 
to exchange sources, communicate in real 
time, share preliminary findings, and disseminate
publications electronically. Indeed,
the dramatic changes in the ways
research is conducted and information is 
exchanged are key factors driving the de- 
velopment of NREN. Through remote ac- 
cess hookups, NREN will provide the 
nation's researchers and students, regard- 
less of the type and size of their college, 
with the same computing tools, data files, 
supercomputers, electronic libraries, spe- 
cialized research facilities, and educational 
technology. 152 It is anticipated that NREN 
will support the transmittal of at least 1
billion bits of data every second by 1995. 

Recognizing the impact a network with 
such unprecedented speed and capacity will 
have on their institutions, librarians have 
joined with other information professionals 
to support the development of NREN. As 
coalition members, librarians are partici- 
pating in a range of NREN-related activi- 
ties, including CNI's seven working groups 
on: (1) encouragement of academic pub- 
lishing; (2) expansion of commercial elec- 
tronic publishing; (3) development of 



151 From Kenneth King, president, EDUCOM, unpublished
paper presented at the "NREN Governance
and Policy" session at National Net '91 Conference,
22 March 1991, Washington, D.C.

152 See NREN: The National Research and Education
Network (Washington, D.C.: Coalition for the
National Research and Education Network, 1989).



network architectures and standards; (4) 
formation of proposals for legislative codes, 
policies, and practices; (5) organization of 
directories and resource information serv- 
ices; (6) creation of teaching and learning 
programs; and (7) improvement of network 
management and user education. 

Through the activity of building a high- 
performance network, a new vision of the 
library is emerging. No longer simply a place 
to visit, libraries are becoming "virtual en- 
terprises" of electronic information. 153 

Conversion 

As a concrete step toward the realization 
of networked electronic libraries, some re- 
positories have begun to convert to ma- 
chine-readable form records originally 
created on paper. The American Memory 
Project at the Library of Congress (LC) 
represents a leading example of this type 
of effort. 154 Over the next five years, the
Library of Congress, with nearly $1 million
per year in congressionally appropriated
funds along with private donations, will 
convert into electronic form large archival 
collections from their holdings relating to 



153 For additional information on NREN see most
recent issues of EDUCOM Review; also, Jean Loup,
National Research and Education Network: Overview
and Summary (Washington, D.C.: Association of Research
Libraries, July 1990); Charles E. Catlett, "The
NSFNet: Beginnings of a National Research Internet,"
Academic Computing 3 (January 1989): 18-21,
59-64; Stephen B. Gould, "Computing and Telecommunications
in the Federal Government," CRS Review
11 (July/August 1990): 12-15; for information
on CNI, see organizational papers available from Paul
Peters, CNI, 1527 New Hampshire Avenue, N.W.,
Washington, D.C. 20036.

154 Additional programs now under way include the
Hunt Library at Carnegie Mellon University and the
Image Transmission Program at the National Agricultural
Library. Other libraries are creating CD-ROMs
on specialized subject areas. The Marine Corps has
announced that it is compiling an online version of
the Marine Corps University warfighting collection
that will allow marines to "fight smart" wherever
they are stationed; see Kevin M. Baerson, "Marines
Put Library On-Line," Federal Computer Week 5 (2
September 1991): 1, 4.

American culture and history. 155 The purpose
of the project is to use advanced technology
to make electronic versions of
collections available to libraries across the 
country. 

The collections chosen for the initial round 
of conversion primarily document aspects 
of turn-of-the-century life in America. They
are drawn from a cross-section of original
formats, including rare pamphlets, early
motion pictures, sound recordings, per- 
sonal papers, and still photographs. A va- 
riety of image, text, and audio types will 
be linked to catalog information in the stan- 
dard MARC (MAchine-Readable Catalog- 
ing) format. 

In fiscal year 1991, the Library of Con- 
gress prepared four collections for elec- 
tronic dissemination, including about 300 
broadsides from the Continental Congress 
and Constitutional Convention; three hours 
of sound recordings of speeches (sixty ex- 
amples) of political leaders during World 
War I and the presidential election of 1920; 
two dozen short motion pictures of Presi- 
dent McKinley at the start of his second 
term and at the 1901 Pan-American Exposition
in Buffalo, New York; and about
25,000 photographs from a well-known 
postcard and scenic-view company founded 
by William Henry Jackson. By the end of 
1992, the library will supplement these with 
collections of Civil War photographs, ap- 
proximately 350 African-American pam- 
phlets (11,000 printed pages written between 
1820 and 1910), local history books from 
California, early films of New York City, 
and life histories from the Federal Writers' 
Project. 

The library's selection process attempts 



155 The American Memory project has received gifts
from the David and Lucille Packard Foundation, the
Annenberg Fund, Inc., Armand Hammer's Occidental
Petroleum Corporation, and Jones International, Ltd.,
as well as gifts or loans of equipment from Apple
Computer, IBM, and Pioneer. See Library of Congress,
"American Memory," LC Information Bulletin
(26 February 1990): 83-87.



to strike a balance between popular, readily 
available collections and unprocessed col- 
lections that comprise a backlog arrearage. 
Selecting an arrearage collection provides 
an impetus for processing it. As selections 
are made, the planners consult both with 
Library of Congress curators and with out- 
side scholars. The first set of American 
Memory collections is being evaluated in
forty school, university, public, and special 
libraries to assess patterns of use. The re- 
sults of this evaluation will provide further 
guidance. 

Compared with all the holdings of the 
Library of Congress, American Memory will 
convert only a relatively small amount dur- 
ing the first few years. The program's ex- 
tent reflects the high cost of conversion, 
the institution's desire to reduce its arrear- 
age, and the typical difficulties encoun- 
tered in the introduction of a new 
technology. To maximize the use of what
it has prepared, however, the library is 
placing special emphasis on educational 
applications. Besides providing the collec- 
tions proper, American Memory's presen- 
tation also will include introductory 
information in interactive, computerized 
form and in print. 

The ultimate goal of the American Mem- 
ory project is to make materials available 
via telecommunications, but this goal will 
be fully realized only in the later 1990s. 
Until then, the collections will be dissem- 
inated on disks: CD-ROMs for digital in- 
formation and analog videodiscs for motion 
picture and some still photographic collec- 
tions. But whether on disk or in a network, 
every American Memory working proto- 
type will model what Ricky Erway, an 
American Memory associate coordinator, 
describes as a "library without walls."
American Memory will be operating as a
pilot project through 1995. 156



156 For further information on American Memory,
contact the Library of Congress, Special Projects Of-






American Archivist / Spring 1992 



Software Engineering 

Many libraries are considering ways to 
expand bibliographic access as part of their 
plans to develop next-generation library 
systems. 157 But few are taking as ambitious 
or comprehensive an approach to the process 
as the staff at Carnegie Mellon's University 
Libraries. With a $1.2 million grant from 
the Pew Memorial Trust and several mil- 
lion dollars of donated hardware from Dig- 
ital Equipment Corporation, the library is 
developing a system that will provide the 
university's faculty, students, and admin- 
istrators with access to bibliographic data- 
bases, full-text documents, and network 
gateways. 158 Library Information System II 
(LIS II), implemented in 1991, is designed 
to improve the quality of retrieval and de- 
livery of textual information to users. In a 



fice, Washington, D.C. 20540, (202) 707-6233. Information
on the project is from discussions by Avra
Michelson with Ricky Erway on 28 December 1990
and Erway and Carl Fleischhauer on 4 February 1991
and from documents supplied by the Library of Congress.

157 For the development of enhanced bibliographic
records, see, for instance, Van Orden, "Content-Enriched
Access to Electronic Information," 27-32; Flo
Wilson, "Article-Level Access in the Online Catalog
at Vanderbilt University," Information Technology and
Libraries 8 (June 1989): 121-31; and Katharina
Klemperer, "New Dimensions for the Online Catalog:
The Dartmouth College Library Experience," Information
Technology and Libraries 8 (June 1989):
138-45; the Klemperer article also discusses Dartmouth's
approach to the development of an integrated
campuswide information system.

158 Information for this section is from a site visit
by Avra Michelson to the University Libraries that
included meetings with Thomas Michalak, Tom Dopirak,
and Denise Troll on 27 March 1991. See also
two reports on the work of the project: Denise A.
Troll, Library Information System II: Progress Report
and Technical Plan, Mercury Technical Report Series,
no. 3 (Pittsburgh, Pa.: Carnegie Mellon University,
1990); and Nancy H. Evans et al., The Vision
of the Electronic Library, Mercury Technical Report
Series, no. 1 (Pittsburgh, Pa.: Carnegie Mellon University,
1989). Also see William Y. Arms and Thomas
J. Michalak, "Carnegie Mellon University," in Campus
Strategies for Libraries and Electronic Information,
edited by Caroline Arms (Bedford, Mass.: Digital
Equipment Corporation, 1990), 243-73.



bold departure from the standard approach 
to library automation, Carnegie Mellon 
separated its public catalog from other li- 
brary administrative functions. As such, LIS 
II is devoted strictly to user-oriented re- 
trieval, whereas OCLC's LS/2000, an au- 
tomated system with integrated modules, is 
in use for other aspects of library administration.

The technical goal of LIS II is to produce 
for networked campuses an affordable li- 
brary retrieval system that adheres to avail- 
able standards. During the first phase, the 
system will run on University Library in- 
stalled workstations. Since January 1992, 
LIS II has been available across campus 
through workstation or VT100 access. A
Macintosh interface is scheduled to be re- 
leased by the end of 1992. The application 
goals of the current system are to provide 
the following: 

• Online bibliographic access to all uni- 
versity resources 

• Bibliographic access at the article level 
to journal literature 

• Electronic access to external data- 
bases 

• Online access to a range of campus 
information 

• Online access to textual information 159

The system's distributed architecture has
been designed to support further research 
and development toward the realization of 
an electronic library. 

Although the system's software supports 
standard bibliographic retrieval, it also provides
enhanced access to select anthologies,
plays, edited collections, exhibition
catalogs, and conference proceedings. Several
thousand bibliographic records for these



159 The University of California at Berkeley's Office
of Information Systems and Technology also is developing
a campus networked information system to
support bibliographic and nonbibliographic databases,
full-text documents, nontextual documents, and hypermedia
links.

types of publications have been embellished
manually or by establishing system
links with nearly one dozen commercial 
products that include tables of contents, ti- 
tle pages, and book reviews. One-page ab- 
stracts are included in the bibliographic 
records of campus-issued scientific and 
technical reports. The intent of this type of
record enhancement is to improve the rel- 
evance of system retrievals. 

Besides record enhancement, the staff 
plans to mount two types of full-text databases,
journal articles and campuswide
information, on the system. Elsevier, Pergamon,
and the Association for Computing
Machinery (ACM) have agreed to provide 
the University Libraries with machine- 
readable journals and technical reports in 
the subject field of computer science. ACM 
will provide extensive runs of four of its 
publications: Computing Reviews (ten 
years), Collected Algorithms (twenty-five
years), Communications (two years), and 
Guide to Computing Literature (ten years). 
Carnegie Mellon is also negotiating an 
agreement to make the publications of the 
American Association for Artificial Intel- 
ligence available in machine-readable form, 
and it is working with academic research 
institutions to collect machine-readable 
computer science technical reports. Con- 
centrating the full-text offerings in an area 
such as artificial intelligence and computer 
science will allow the University Libraries 
to further evaluate scholarly information 
needs by studying the use of textual information
in a single discipline.

The University Libraries also are install- 
ing a CD-ROM jukebox system from Uni- 
versity Microfilms, Inc. That system 
includes full-text images of general and 
business journals linked to bibliographic ci- 
tations in tape-mounted databases on LIS 
II. In the final phase of the project, the 
images will be delivered to workstations 
across campus. 

The full-text, campus-oriented documents
require an indexing scheme entirely
different from that developed for standard 
bibliographic data. The new system will 
provide campus software licensing and 
availability information, career and place- 
ment resources, the Carnegie Mellon Pol- 
icies and Procedures Manual, the 
undergraduate catalog, user help files for 
other campuswide systems, listings of fac- 
ulty and staff publications (including re- 
search profiles), and indexes and full text 
of campus newspapers. Standard office ref- 
erence materials, such as phone books, en- 
cyclopedias, and dictionaries, are already 
available. 

Development of the system's user inter- 
face is based on staff findings on user work 
habits and information-seeking behaviors. 
According to the research, patrons rarely 
refer to documents in isolation from other 
activity. For this reason, the LIS II archi- 
tecture has been designed to integrate with 
a larger work environment, supporting 
linkages to word processors, databases, e- 
mail, and parallel applications. Toolkits 
(special software routines) permit LIS II 
users to make individual databases avail- 
able across the network. Other features al- 
low patrons to store searches for reuse, move 
in one keystroke from a journal article ci- 
tation to the full text of the article, and 
improve queries by browsing indexes that 
reveal how often terms are used. The win- 
dowed screen environment can be custom- 
ized by each user. 
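
The index-browsing feature described above, in which patrons see how often terms are used before refining a query, can be illustrated with a minimal sketch. The sample records and the function names build_term_index and browse are assumptions made for illustration only; nothing here reflects the actual LIS II software.

```python
from collections import Counter
import re

# Hypothetical sample records; these titles are invented for illustration.
RECORDS = [
    "Network access to electronic libraries",
    "Electronic publishing and the research library",
    "Archives and electronic records",
    "Hypermedia in the research process",
]


def build_term_index(records):
    """Count, for each term, the number of records that contain it."""
    index = Counter()
    for text in records:
        # A set ensures each record counts a term at most once.
        terms = set(re.findall(r"[a-z]+", text.lower()))
        index.update(terms)
    return index


def browse(index, prefix):
    """Return (term, record_count) pairs for terms starting with prefix,
    in alphabetical order, as a patron might see when browsing an index."""
    return [(term, index[term]) for term in sorted(index) if term.startswith(prefix)]


index = build_term_index(RECORDS)
for term, count in browse(index, "r"):
    print(f"{term:12} {count}")
```

A patron who sees that a term occurs in very many records might narrow the query, while a term that occurs in none signals a spelling or vocabulary problem before the search is even run.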

The creation of an electronic library linked 
to other electronic libraries requires sus- 
tained effort. LIS II provides in substantial 
measure an architecture to support full-text
electronic delivery of documents in librar- 
ies. In creating this system, the developers 
clarified many issues and resolved other 
important issues in the areas of distributed 
storage and retrieval systems, information 
capture and representation, information re- 
trieval and delivery, and management and 
economic concerns. Carnegie Mellon plans
to make the software developed for LIS II
available to other libraries. 

Transformations in Professional Roles 

Library literature contains many propos- 
als for new roles for library professionals 
in the electronic age. 160 Among these, the
programmatic achievements of the Laboratory
for Applied Research in Academic
Information serve as one of the best operational
models for redefining the librarian's
role on campus. A division of the
William H. Welch Medical Library at The 
Johns Hopkins University, the laboratory 
is a collaboration among academic schol- 
ars, scientists, and librarians. They share 
responsibility for the creation, structuring, 
representation, dissemination, and use of 
scholarly knowledge through the use of 
computing and communication technology. 
Created in 1987 by Nina W. Matheson and 
Richard E. Lucier, the Laboratory explores 
strategies for integrating the library more 
fully into the scholarly communication 
process. 161 Lucier has developed what he 
terms the "knowledge management model," 
which extends the library's traditional stor- 
age and retrieval and information transfer 



160 See, for instance, the ideas developed by Eldred
Smith in his book The Librarian, the Scholar, and
the Future of the Research Library (New York:
Greenwood Press, 1990), especially 60-63 and 83-84.
Articles by Bert B. Boyce and Kathleen M. Heim,
"The Education of Library Systems Analysts for the
Nineties," and John Corbin, "The Education of Librarians
in an Age of Information Technology," in
Computing, Electronic Publishing and Information
Technology: Their Impact on Academic Libraries, edited
by Robin Downes (New York: Haworth Press,
1988), 60-63 and 83-84, respectively; and Timothy
C. Weiskel, "University Libraries, Integrated Scholarly
Information Systems (ISIS), and the Changing
Character of Academic Research," Library Hi Tech
6 (1988): 7-27.

161 This section is based on briefings of Avra Michelson
by Richard Lucier and Valerie Florence, 7 May
1991; see also Richard Lucier, "Knowledge Management:
Refining Roles in Scientific Communication,"
EDUCOM Review 25 (Fall 1990): 21-27. For information
on particular projects, see Welch Library Issues,
vol. 2, nos. 1, 4, and 6.



functions to include a third function, 
knowledge management. 

In the knowledge management model, li- 
brarians are teamed with content special- 
ists, software engineers, and social scientists 
to identify the specialized information needs 
of a constituency and then address the needs 
with the aid of information technology. In 
this model, the laboratory performs three 
types of work: (1) knowledge base and 
software development; (2) research and 
scientific support through ongoing needs 
assessments and quality control of data, ed- 
ucation and training; and (3) service through 
the management of the computing and 
communications infrastructure. The social 
scientists assess information needs by using 
standard methodologies, such as partici- 
pant observation, formal and unstructured 
interviews, and document analysis.

The laboratory recently received a three- 
year grant from the Council on Library Re- 
sources (CLR) to document the knowledge 
management model and explore the feasibility
of implementing the model in
nonmedical environments. The CLR funds 
also support an invitational symposium on 
knowledge management. The laboratory's
key projects have been the development of 
the Online Mendelian Inheritance in Man 
(OMIM) and the Genome Data Base, which 
are comprehensive scientific sources used 
by geneticists worldwide for gene map- 
ping, genetic disease diagnosis, and patient 
care. These online projects allow an inter- 
national group of scientists to collect, or- 
ganize, and electronically distribute mapping 
and disease information on approximately 
100,000 genes that regulate human health 
and development. The constantly evolving 
Genome Data Base is maintained by more 
than one hundred scientists around the world.
Lucier considers the database to be a form 
of dynamic, interactive publication that, 
unlike static print publications, always pro- 
vides the most current information and 
analysis by the most respected scientific 
authorities.
Through the development of the Genome
Data Base, OMIM, and other projects, the 
laboratory has demonstrated that knowl- 
edge management represents a "practical 
working alternative to existing roles and re- 
lationships in the creation and management 
of scholarly knowledge." 162 Lucier will
expand his work in the development of the 
new Center for Knowledge Management at 
the University of California at San Fran- 
cisco. 

This section reviewed some of the li- 
brary community's strategies. The next 
section recommends actions that the archi- 
val profession can take to respond to 
changing research methods. These actions 
are an important step toward confronting 
the transformation of scholarly practice that 
is as imminent as the new millennium. 

CONCLUSION AND 
RECOMMENDATIONS 

The scholarly use of information tech- 
nology is resulting in dramatic changes in 
research practices. Essentially two trends 
are evident: one toward end-user comput- 
ing and the other toward connectivity. To 
an increasing extent, social scientists and 
humanists are performing their own com- 
putation in the context of ever greater con- 
nectivity. The scholarly use of computers 
and communication technology for re- 
search and information exchange has both 
short-term and long-term ramifications for 
archival practice. In the short term, the ar- 
chival profession needs to address the in- 
creasing prominence of network-mediated 
scholarship. In the long term, the role of 
the archival profession in the development 
of next-generation archives that operate in 



162 "Knowledge Management: A Collaboration of
Academic Scholars, Scientists and Librarians," unpublished
statement on the three-year project sponsored
by the Council on Library Resources, The
William H. Welch Medical Library, Laboratory for
Applied Research in Academic Information (15 July
1990).



conjunction with global networks needs to 
be defined. The following recommenda- 
tions suggest concrete actions the archival 
profession can take to address both of these 
issues during the next decade: 

• Establish a presence on the Internet/NREN.

• Make source materials available for 
research use over the Internet. 

• Create documentation strategies to 
document network-mediated scholar- 
ship and the development of research 
and education networks as a new 
communications medium. 

• Develop archival methods suitable for 
operation with NREN. 

• Take user practices and computational 
capacity into account in establishing 
policies on the management of soft- 
ware-dependent records. 

• Recognize and reward initiatives that 
advance (a) the archival management 
of electronic records; (b) the response 
to scholarly use of information tech- 
nology; and (c) a network-mediated 
archival practice. 

These recommendations are considered in 
the three-part discussion below. 

Part I: Establishing a Network- 
Mediated Archival Practice 

The archival profession, first and fore- 
most, must respond to the emergence of 
network-mediated scholarship. New meth- 
ods of searching for sources, communicat- 
ing with colleagues, disseminating research 
findings, and providing instruction suggest 
that scholarly communication is increas- 
ingly mediated through electronic net- 
works. The existing Internet and the future 
NREN represent the new meeting ground 
where scholars turn for bibliographic in- 
formation, scholarly dialogues and feed- 
back, the most current publications in their 
fields, and high-level educational offer- 
ings. Increasingly, full-text versions of 
journals, magazines, newsletters, and even 
primary sources are available through networks.
In response to this new phenomenon, the
archival profession needs to
establish a presence on research and edu- 
cation networks and to evaluate the impli- 
cations of new forms of scholarly 
communication for standard archival prac- 
tice. 

But before attempting to introduce pol- 
icy or collaborative action, the archival 
profession must start using the networks. 
Indeed, the use of networks is the chief 
action archivists can take in response to 
changing patterns of scholarly communi- 
cation. A presence on the Internet is essen- 
tial if archivists are to establish credibility 
as legitimate network collaborators. 

Establishing an archival presence on the 
networks is affordable. Telecommunica- 
tions hookups involve a modem, commu- 
nication software, and an e-mail address 
provided through a link to an already ex- 
isting network connection. For archivists 
who do not already possess a modem and 
who choose not to use public domain com- 
munication software, the cost entails a one- 
time expenditure of, at most, several hundred 
dollars. Ongoing connect charges in the 
United States are minimal. Most archivists 
should experience little trouble obtaining 
an electronic mail address because the ma- 
jority of campuses are already wired for 
network connections, as are federal and state 
agencies and many private organizations and 
corporations, especially those affiliated with
scientific research and development. In fact,
several hundred archivists 163 already participate
on BITNET in the network list Archives
and Archivists. Once hardware and
communication software are in place, the 
archival profession can become an Internet 
participant. 



163 As of June 1992, approximately 440 archivists
subscribed to the Bitnet Archives and Archivists
listserv.



Recommendation 1: Archivists should 
begin monitoring and responding to 
scholars' intellectual activities conducted 
on networks. 

Besides the standard methods for keep- 
ing current on research trends, archivists 
should participate in scholarly electronic 
conferences. To participate, one signs up, 
or "subscribes," to a conference. Because 
thousands of conferences exist, archivists 
should use conference lists and compiled 
directories to select those that involve sub- 
ject areas most closely approximating the 
holdings of their repository. For instance, 
a repository strong in women's history 
sources may subscribe to the lists devoted 
to women's and gender studies. An insti- 
tution noted for its collection of pre-Civil 
War holdings may choose a conference de- 
voted to eighteenth century America. So- 
cial welfare archives may sign up for 
conferences related to social work, social 
activism, and family studies. Those with 
strong collections of Utopian records may 
select the Shaker conference. Repositories 
noted for their holdings on the arts may join
the many conferences on theater, film, and 
drama. 164 

One way scholars use these conferences 
is to exchange information about source 
materials related to research topics. In an 
effort to participate in these dialogues, 
NARA's Center for Electronic Records be- 
gan monitoring several scholarly confer- 
ences in 1991. The conferences offer the 
center a forum for responding to several 
dozen additional inquiries each month from 
scholars and librarians relating to the cen- 
ter's holdings. One center staff member 
currently spends about thirty minutes each 
day monitoring four BITNET Listservs on 
topics related to government documents, 



164 Examples of electronic conferences are from Kovacs,
Directory of Scholarly Electronic Conferences,
3rd rev.






electronic data sets, social science data lists, 
and the Vietnam War. 165 

These conferences not only provide a 
means for keeping up with trends in schol- 
arly research but also provide a mechanism 
for establishing a presence on the networks 
by attaching a name and institutional affil- 
iation to each communication. As simplis- 
tic as this sounds, a more substantive 
involvement with networks can occur only 
when archivists are familiar with the are- 
na's discourse and techniques and when the 
archival profession is established as a net- 
work participant. We therefore recommend 
as an initial action that archivists establish 
a presence on the Internet by participating 
in network conferences. 

Recommendation 2: The archival 
profession should identify and implement 
archival methods appropriate to new 
forms of scholarly communication. 

Establishing a presence on the networks 
is a necessary first step. But in addition to 
conference participation, the archival 
profession should pursue archival methods 
responsive to changes in scholarly com- 
munication. These new archival practices 
and techniques include: providing access 
on the Internet to source materials in ma- 
chine-readable form, initially as bit-mapped 
images; documenting the activities of net- 
work-mediated scholarship; and establish- 
ing archives that operate in the Internet/ 
NREN environment.

2 (a): The archival profession should 
make source materials available on the 
Internet. The archival profession should 
make sources directly available to scholars 
via research and education networks. The 
sources should include both records that 



originate in electronic form and those cre- 
ated in nonelectronic forms. Since the 
transfer of nonelectronic records to ma- 
chine-readable form is a formidable under- 
taking, this discussion focuses primarily on 
conversion strategies. 

Converting nonelectronic sources to ma- 
chine-readable form is justified for several 
reasons. First, the scholarly expectation that 
full-text materials should be available on- 
line as a research convenience is unmistak- 
ably evident and growing. 166 Indeed, 
electronic document delivery represents the 
undisputed standard for the information 
field. Second, beyond convenience, con- 
version of source materials to machine- 
readable form is essential for analyses that 
rely on computational processing. Third, 
with increasing frequency, the types of 
questions posed by researchers require en- 
tire electronic libraries of sources, instead 
of a single collection, available for com- 
putational processing. From this perspec- 
tive, the larger the corpus of converted 
collections, the greater the research value. 

As further justification, in the absence 
of an archival role in the conversion of 
source materials, the commercial sector is 
certain to prevail. This is not to suggest that 
many types of conversion projects would 
not be more suitable as commercial sector 
undertakings. But as the transition to the 
online era proceeds, archivists have the re- 
sponsibility to ensure that publicly avail- 
able records remain so when converted to 
machine-readable form and to alert citizens 
to the danger of losing the right of free 
access through inaction. 

The proposal to convert source materials 
to machine-readable form is neither radical 
nor original. Many leaders in the library 
profession argue that conversion is one of 



165 Conversation between Avra Michelson and Ted
Hull of NARA's Center for Electronic Records, 26
August 1991; also Ted Hull, "NNXA Reference Report,"
NARA Center for Electronic Records, June
1991, draft.



166 Shrinking travel allocations also may spur requests
for online access, if the cost of geographically
dispersed archival research exceeds academic bud- 
gets. 
the most important actions librarians can
take to establish a comprehensive record of
scholarship. 167 As discussed earlier, some
libraries are already performing pilot con- 
versions. Further, the Commission on Pres- 
ervation and Access recently released several 
reports recommending that preservation 
microfilming include the generation of dig- 
ital images. 168 In an alternative approach, 
Cornell University Library, in conjunction 
with Xerox Corporation and the Commis- 
sion on Preservation and Access, demon- 
strated the feasibility of directly converting 
text to digital form, avoiding the costs as- 
sociated with microfilming. 169 

In arguing that archivists should convert 
nonelectronic holdings to machine-reada- 
ble form, we are not suggesting that it is 
either feasible or desirable to convert all 
records. The volume of archival holdings 
is simply too great, and many holdings do 



167 See, for instance, Smith, The Librarian, the
Scholar, and the Future of the Research Library, 71-72.
Clifford Lynch also recommends conversion of
source materials to digital form in "Achieving the
Promise: A Proposed Strategic Agenda for Libraries
and Networked Information Resources in the 1990s,"
unpublished paper presented at the Networks for
Networkers II Pre-Conference, Chantilly, Virginia, 17-19
December 1990, 18 (also published under that title
in Networks for Networkers: Critical Issues for Libraries
in the National Network Environment, edited
by Barbara Evans-Markuson with Elaine W. Woods
[New York: Neal-Schuman Publishers, forthcoming]).

168 See Donald J. Waters, From Microfilm to Digital
Imagery (Washington, D.C.: Commission on Preservation
and Access, June 1991), and Michael Lesk,
Image Formats for Preservation and Access (Washington,
D.C.: Commission on Preservation and Access,
July 1990). These reports explore microfilming
as a means to achieve digitization.

169 The Cornell project, co-managed by Anne R.
Kenney and Lynne K. Personius, involves the direct
conversion of one thousand volumes of brittle books
to digital form. Half of the volumes are mathematical
books, some of which are handwritten or contain formulas
and graphic images. The Cornell project uses
Xerox hardware that is capable of producing both digital
output and enhanced print output from a digital
copy. This collaborative effort has produced meaningful
data on costs, procedures, and models associated
with digitization programs useful to the archival
profession. See Kenney and Personius, "The Future
of Digital Preservation."



not warrant the investment. Rather, our point 
is that it is time to begin breaking the tie 
with the printed past and establishing a 
connection with the machine-readable future.

Converting source materials to machine- 
readable form entails the resolution of many 
issues that are beyond the scope of this paper.
However, we would like to comment
on a few basic archival questions related to 
conversion: What should be converted? 
What electronic form should conversion re- 
sult in? What kind of new descriptive de- 
vices are necessary to facilitate the 
independent use of electronic versions of 
source materials? 

What should be converted? Most repositories
periodically, if not regularly, microfilm
deteriorating collections of enduring
value. Applying current technology, mi- 
crofilm preservation projects could be ex- 
panded or transformed to digital conversion 
projects through the development of sev- 
eral funded, model programs. The benefit 
of establishing digital conversion programs 
based on preservation microfilming is that 
many procedures in place for microfilming 
are also suitable for imaging. First, mate- 
rials for preservation microfilming typi- 
cally are selected because they are in need 
of preservation attention and are deserving 
of wider access. These two elements are 
adequate criteria for the current selection 
of collections to be digitized. 170



170 Other categories of records also may make good
candidates for conversion even though they are not
deteriorating. In selecting records primarily to provide
greater access, other factors should be considered,
including the nature and extent of use of the records,
the institutional visibility or impact afforded by the
conversion, the type of image required for use, the
volume and condition of the records requiring conversion,
special labor costs, and the extent to which
conversion can be accomplished through scanning,
optical character recognition (OCR), or manual input.
But we think it would be a mistake for the archival
profession to expend much effort at this point on refining
selection criteria until the results of a number
of digital conversion projects can be analyzed. Fur-
Second, the document preparation
processes used with microfilming are largely 
compatible with digital conversion. 171 This 
means that handling procedures in place for 
preservation microfilming can essentially 
be applied to digitization. Third, micro- 
filming and digitizing can be intertwined 
technical processes. That is, while it is 
technically possible to generate a microfilm 
copy as output from a digitized collection, 
it is also possible to generate a digitized 
copy of a record set from microfilm output. 
This means that it is possible to create mi- 
crofilm and then digitize the output, or dig- 
itize directly and then generate microfilm. 
As such, repositories concerned with the 
longevity of digital storage mediums, or their 
ability to move digital data from one gen- 
eration of technology to another, can con- 
tinue to rely on microfilm for preservation 
purposes and still convert records to
machine-readable form.

Repositories that plan to microfilm are 
encouraged to establish pilot digital pro- 
grams that draw on many structures already 
in place for preservation microfilming. The 
archival profession needs tested models to 
establish the most cost-effective procedures 
for administering ongoing conversion pro- 
grams. Pilot projects should provide suffi- 
cient technical and programmatic guidance 
and an awareness of how digital sources 
are used, to equip the profession with the 
ability to implement large-scale digital con- 
versions. 

What electronic form should digital conversions
result in? The profession's assess-



ther, the Commission on Preservation and Access has
contracted with Margaret Child to reconsider current 
criteria used to select source materials for preservation 
microfilming. Presumably the archival profession will 
find the results of this study relevant to digital con- 
version efforts as well. 

171 See Archival Research and Evaluation Staff, Optical
Digital Image Storage System: Project Report
(Washington, D.C.: National Archives and Records
Administration, 1991), 6; and Kenney and Personius,
"The Future of Digital Preservation," 9.



ment of appropriate electronic forms will 
probably change over time. The overriding 
concern, however, must be to identify the 
kinds of representations patrons need. Do 
they need a facsimile image of documents? 
A stream of straight ASCII text that can be 
manipulated? ASCII text encoded with tags 
that identify document structures and formats?
Although the electronic forms that patrons
need depend on the type of research
they are conducting, very little is
known about the actual use of electronic 
documents for different types of research. 
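
The difference between straight ASCII text and encoded text can be illustrated with a short sketch. The sample letter, the tag names (letter, dateline, place, date, body, signature), and the helper functions below are invented for illustration and do not follow any particular encoding standard; the example assumes only that the encoded version is well-formed enough to be parsed by Python's standard XML library.

```python
import xml.etree.ElementTree as ET

# Straight ASCII: the words survive, but nothing marks which part is which.
plain_text = (
    "Boston, 3 March 1861\n"
    "Dear Sir, I write regarding the shipment bound for Boston.\n"
    "Yours, A. Clerk\n"
)

# The same letter with structural tags (tag names are hypothetical).
encoded_text = """
<letter>
  <dateline><place>Boston</place>, <date>3 March 1861</date></dateline>
  <body>Dear Sir, I write regarding the shipment bound for Boston.</body>
  <signature>A. Clerk</signature>
</letter>
"""


def count_word(text, word):
    """Count occurrences of a word anywhere in unstructured text."""
    return text.count(word)


def field(encoded, tag):
    """Return the text of the first element with the given tag, or None."""
    element = ET.fromstring(encoded.strip()).find(f".//{tag}")
    return element.text if element is not None else None


if __name__ == "__main__":
    # A plain-text search cannot tell the place of writing from a mention.
    print(count_word(plain_text, "Boston"))
    # A structure-aware query can ask for the place of writing directly.
    print(field(encoded_text, "place"))
    print(field(encoded_text, "signature"))
```

A bit-mapped facsimile of the same letter would support neither kind of query without prior character recognition, which is why the choice of form depends on the research the patron intends to perform.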

Trends in the technology suggest that in 
the future the archival profession should be 
able to provide access to electronic sources 
both as bit-mapped images and encoded text. 
But current limitations make large-scale 
encoding of text an unrealistic undertaking. 
For many reasons, the existing methods of 
performing ASCII conversions, manual key 
entry or automatic optical character recognition
(OCR), are inadequate. For example,
the cost of performing key entry
with great volumes of materials is prohib- 
itive, and OCR processes are unreliable with 
handwritten script and unusual type fonts. 
In contrast, bit-mapped conversions, which 
result in image representations (like fac- 
similes, but potentially of far greater res- 
olution), are readily attainable with today's 
technology. Further, automatically con- 
verting bit-mapped images of modern printed 
documents to ASCII is typically considered 
a straightforward process (equivalent to 
OCR). If desired, encoded text can be gen- 
erated from the ASCII version, provided 
the relevant structural information has been 
retained. 

ASCII and encoded text differ from bit- 
mapped images in that the latter cannot be 
searched and computationally processed 
without considerable programming. It is 
highly probable, however, that software 
designed to encode text automatically will 
improve and reduce in cost during this dec- 
ade. If this happens, it may be feasible to 
justify large-scale textual encoding. Until
then, sources not amenable to OCR should
be converted to bit-mapped images. But 
since bit-mapped images will not satisfy the 
research needs of certain scholars, archi- 
vists should monitor advances in OCR and 
structure-encoding software. 

What kind of new descriptive devices are
necessary to facilitate independent use by
researchers of electronic versions of source
materials? Digital versions of large archival
collections will need specialized finding
aids, descriptors, navigational aids, or
informational hooks to facilitate their in- 
dependent use. 172 Developing these finding 
aids and navigational tools represents a key 
challenge for the information profession. 
Nonetheless, it would be ill-advised to convert
unstructured and voluminous collections
to machine-readable form, or to make
collections that originate in machine-read- 
able form available for independent use, 
without addressing the need for a descrip- 
tive system suitable to the electronic envi- 
ronment. As a further complication, standard 
bibliographic approaches to retrieval are 
proving an inadequate method for locating 
and managing remote electronic text banks. 
But metadata, data about data that archivists
typically collect about a body of records,
may serve as the basis for a
supplementary descriptive system to com- 
plement existing bibliographic informa- 
tion. Administrative histories, accession 
records, and other contextual data used to 
establish the provenance of a collection may 
prove very useful in retrieving information 
from electronic sources in the absence of 
human intermediaries. 173 



172 For a justification of the need for new access
tools, see Clifford A. Lynch and Cecilia M. Preston,
"Internet Access to Information Resources," in Annual
Review of Information Science and Technology
(ARIST), vol. 25, edited by Martha E. Williams (Amsterdam:
Elsevier Science Publishers B.V., 1990); and
Lynch, "Achieving the Promise," 24-25.

173 Charles Robb, at the Kentucky Department for
Libraries and Archives, is developing a locator system
for statewide information using metadata to comple-


It is encouraging to note that contextual 
information accreted to each document in 
records originating in machine-readable form 
is likely to be greater than in their print 
counterparts. For example, e-mail mes- 
sages interchanged on the Internet identify 
the sender and institution, the receiver(s) 
and institution(s), the date and time of 
transmittal, and the subject of the com- 
munication. Archival intervention into the 
design phase of software could result in the 
accumulation of other metadata that would 
be useful for both accountability and re- 
trieval purposes. We therefore endorse the 
National Historical Publications and Rec- 
ords Commission's proposal to research the 
implications of capturing and retaining data, 
descriptive information, and contextual in- 
formation in electronic form, and we spec- 
ulate that the findings of this research can 
also advance the development of descrip- 
tive systems suitable for independent use 
by end-users. 174 
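
As a minimal sketch of how the contextual information carried by an e-mail message could be captured as metadata, the following reads the header fields named above (sender, receivers, date and time of transmittal, and subject) using Python's standard email library. The sample message, the addresses, and the function name extract_metadata are hypothetical, and treating the domain part of an address as the institution is a simplifying assumption rather than an established archival practice.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr, parsedate_to_datetime

# A hypothetical message; names and addresses are invented for illustration.
SAMPLE_MESSAGE = """\
From: Ada Scholar <scholar@history.example.edu>
To: Ben Archivist <archivist@archives.example.org>
Date: Mon, 15 Jun 1992 09:30:00 -0400
Subject: Inquiry about census data files

Do you hold the 1970 census summary tapes?
"""


def extract_metadata(raw_message):
    """Collect sender, receivers, transmittal time, and subject from headers."""
    msg = message_from_string(raw_message)
    _, sender = parseaddr(msg["From"])
    recipients = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    return {
        "sender": sender,
        # Assumption: the domain of the address stands in for the institution.
        "sender_institution": sender.split("@")[-1],
        "recipients": recipients,
        "recipient_institutions": [a.split("@")[-1] for a in recipients],
        "sent": parsedate_to_datetime(msg["Date"]).isoformat(),
        "subject": msg["Subject"],
    }


if __name__ == "__main__":
    for key, value in extract_metadata(SAMPLE_MESSAGE).items():
        print(f"{key}: {value}")
```

Because these fields are recorded by the mail software itself, a descriptive system could accumulate them without any additional effort from the record creator.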

2 (b): Archivists should develop and 
implement a strategy for documenting 
network-mediated scholarship as a new 
phenomenon of scholarly communication.
A key finding of this report is the
substantial level of scholarly activity being 
conducted outside the purview of tradi- 
tional archival practice. Network-mediated 
scholarship raises two very different but re- 
lated documentation issues for the archival 
profession. The first is the need to docu- 
ment the origin and administration of re- 
search and education networks themselves. 
The second is the need to document the 



ment existing bibliographic information. This approach
may be useful in developing descriptive systems
that provide access to information within a collection
as well. See Charles Robb, "Networking Metadata in
Kentucky," unpublished paper presented to the National
Association of Government Archivists and Records
Administrators, Chicago, July 1990.

174 This recommendation is part of Research Issues
in Electronic Records, 10-11. Charles Dollar also ar- 
gues that archivists should define metadata elements 
in his report, The Impact of Information Technologies 
on Archival Principles and Methods, 98-100. 
programmatic use of these networks for the
advancement of scholarship and learning. 
As an approach and process, documenta- 
tion strategy 175 represents a tool that archi- 
vists can use to address these documentation 
problems— e.g., to identify the key agents 
operating in the network environment, to 
determine the universe of documentation that 
exists, and to develop recommendations for 
preserving documentation of enduring value. 

The large number of agents and the global 
scope of activities associated with research 
and education networks suggest that archivists
may want to collaborate and seek
multi-institutional funding for documenta- 
tion projects. At least three types of archi- 
val repositories are well-positioned to initiate 
such projects: (1) college and university ar- 
chives, because network research and ed- 
ucation efforts originate largely in academia; 
(2) government archives, because govern- 
ment is a key partner in most academic- 
based collaborative research projects and 
network-mediated education programs 
(either as a funder, research associate, or 
network administrator); and (3) discipline 
history centers (such as the American Institute
of Physics, the Beckman Center for
the History of Chemistry, and the Babbage 
Center), as these centers, by definition, ex- 
plore a universe of documentation and are 
heavily devoted to science and technology, 
disciplines in which network-mediated 
scholarship is currently the most pervasive. 

The documentation effort should identify 
key representatives to participate in stra- 
tegic discussions, such as those from the 
Internet and scholarly communities, aca- 
demic computing centers, private industry, 



175 Two seminal essays that together provide an intellectual
foundation for the concept of documentation
strategy, as well as an examination of procedures and
case examples, are: Helen Willa Samuels, "Who Controls
the Past," American Archivist 49 (Spring 1986):
109-24; and Larry Hackman and Joan Warnow-Blewett,
"The Documentation Strategy Process: A Model
and a Case Study," American Archivist 50 (Winter
1987): 12-47.



and government research laboratories. A 
goal of the effort should be to clarify the 
principal records-creating agents and the 
activities that warrant preservation. The 
project report should include a statement on 
the nature of electronic archival records and 
the relationship of these sources to non- 
electronic documentation. 

This recommendation involves a certain 
urgency because existing documentation 
tends to be transient. In fact, compilers of 
several network directories report that at 
least a half dozen recent scholarly elec- 
tronic conferences are already defunct, as 
are more than a dozen electronic newslet- 
ters and journals. 176 Some argue that these 
efforts become inactive when moderators 
switch jobs and no longer possess the 
equipment or time to continue in that role 
or when the interest in a once-timely topic, 
such as the Gulf War, dissipates. Instead 
of papers removed to an attic for storage, 
the records of a defunct electronic confer- 
ence typically take the form of a mass of 
bits abandoned on a campus mainframe 
computer or file server, awaiting a purge 
of the file by a systems administrator in a 
routine cleanup. Given this situation, aca- 
demic computing staff represent key con- 
tacts for campus archivists concerned with 
network files. State archivists also should 
be concerned with the transient nature of 
network communication because network- 
mediated distance education programs are 
under way in most state departments of ed- 
ucation. 177 In summary, archivists at insti- 



176 Correspondence via Bitnet on 23 August 1991
between Avra Michelson and Diane Kovacs, compiler
of Directory of Scholarly Electronic Conferences; also,
a list of defunct electronic journals and newsletters
appears in Michael Strangelove, Directory of Electronic
Journals and Newsletters.

177 See two reports by Barbara Kurshan: Statewide
Telecommunication Networks: An Overview of the
Current State and the Growth Potential (Roanoke,
Va.: Educorp Consultants, December 1990), and with
Marcia Harrington, Statewide Education Networks:
Survey Results (Roanoke, Va.: Educorp Consultants,
April 1991). Both are available through Bitnet from
the author (Kurshan@vtvm1.bitnet).
tutions that support online scholarly
communication are urged to seek funding 
for programs to identify and preserve valu- 
able records related to the administration 
of networks comprising the Internet and 
network-mediated scholarship. 

2 (c): The archival profession should 
support the development of archives de- 
signed to operate on global networks. The 
growth in network-mediated scholarship 
suggests that the archival profession needs 
to define its role in relation to the devel- 
opment of archives designed to operate in 
the global network environment. The need 
for archival operations on research and ed- 
ucation networks is already widely recog- 
nized by the network community. For 
example, program planning in the network 
community involves archival concerns. At 
a biannual meeting of the Coalition for Net- 
worked Information (CNI), many subcom- 
mittees reported on work that entailed the 
resolution of archival functions in a net- 
work environment. 178 Although separate 
from the archival profession, CNI rep- 
resents a group that is identifying issues 
related to the archiving of network re- 
sources. 

Further, the development of electronic 
network archives is already evident. Most 
moderators of scholarly electronic confer- 
ences maintain an archives of the confer- 
ence's transactions accessible via the 
network. 179 Others are capturing subject- 
oriented transactions across research and 
education networks and making the archives
available on the Internet. 180 Still



178 Observation by Avra Michelson at the CNI spring
meeting, 18-20 March 1991, Washington, D.C.

179 Correspondence by Avra Michelson with Diane
Kovacs, compiler of the Directory of Scholarly Electronic
Conferences, on 23 August 1991 via BITNET.

180 For instance, Edward Vielmetti at MSEN in Ann
Arbor, Michigan, collects and makes available descriptions
of network resources publicized on the networks
[emv@msen.com]; Nathan Torkington at the
Computing Services Center in Wellington, New Zea-


others are exploring commercial models for 
preserving both volume and breadth in net- 
work transactions. 181 Those involved in 
network archiving communicate with one 
another through electronic conferences about 
such issues as data compression algo- 
rithms, information filtering techniques, and 
file transfer protocols. 182 This means that 
seminal models for microarchiving within 
a network environment are already in place, 
while those for archiving on a grander scale 
are either on the drawing board or being
prototyped, each established apart from the 
work of the traditional archival profession. 

Archivists must not underestimate the 
significance of these actions. The future of 
the archival mission in relation to elec- 
tronic communication is being defined by 
a set of agents wholly separate from the 
work of the traditional archival profession. 
Further, the scope of the new archival agents 
is apt to grow as NREN evolves into a piece 
of the backbone used in the conduct of of- 
ficial government business. 183 The appro- 
priate role for the archival profession in this 
arena remains undefined, but the key ques- 
tions are clear. Can the archival profession 
establish the political authority necessary to 
improve the archival methods used in con- 
junction with research and education net- 
work transactions, and can it rise to the 



land, maintains a publicly accessible electronic archives
of text on information management captured
from network exchanges [gnat@kauri.vuw.ac.nz].

181 Vielmetti has developed commercial models for
archiving select network transactions.

182 The key electronic conference where these issues
are discussed is comp.archives.admin, moderated by
Edward Vielmetti.

183 The U.S. Office of Personnel Management recently
released guidelines already effective for the
acceptance of electronic signatures in the conduct of
official government business. With the issue of elec- 
tronic signatures resolved, the use of networks in of- 
ficial government business can be expected to increase 
rapidly. See U.S. Office of Personnel Management, 
Federal Personnel Manual, Chapter 293, Subchapter 
6, Installment 39, 1 April 1991 (Washington, D.C.: 
U.S. Government Printing Office,
challenge of defining an archival practice
suitable not only for electronic records but 
also for a new communication medium? 

Part II: Establishing a Strategy for the 
Future Usability of Electronic Records 

No discussion of information technology 
trends can ignore the issues surrounding the 
storage and use of electronic records them- 
selves. Although this subject has been dis- 
cussed in the archival literature, 184 our focus 
here is on the scholarly research perspec- 
tive. This article has concentrated on the 
near-term effects of information technol- 
ogy on current scholarly practice and prod- 
ucts. It is equally important, however, to 
consider how new ways of producing rec- 
ords (whether they are of scholarly origin 
or not) will affect future users of those rec- 
ords. In particular, how will the creation 
of electronic records affect future scholars 
when they use such records in their re- 
search? What current technology trends bear 
on the ways these future scholars will per- 
form their research and— by implication — 
on the ways future archives will have to 
serve them? 

One of the main advantages of electronic 



184 A selection of the key literature includes David
Bearman, Archival Methods, Archives and Museum
Informatics Technical Report, no. 21 (Spring 1989);
Advisory Committee for the Co-ordination of Information
Systems (ACCIS), Management of Electronic
Records: Issues and Guidelines (New York: United
Nations, 1990); U.S. House, Committee on Government
Operations, "Taking a Byte out of History: The
Archival Preservation of Federal Computer Records,"
House Report 101-978 (Washington, D.C.: Government
Printing Office, November 1990); Research Issues
in Electronic Records (St. Paul, Minn.: published
for the National Historical Publications and Records
Commission, Washington, D.C., by the Minnesota
Historical Society, 1991); David Bearman, ed., Archival
Management of Electronic Records, Archives
and Museum Informatics Technical Report no. 13
(Pittsburgh: Archives and Museum Informatics, 1991);
Margaret Hedstrom, "Understanding Electronic Incunabula:
A Framework for Research on Electronic
Records," American Archivist 54 (Summer 1991): 334-54.



information is that it is usually digital, which 
ensures that it can be copied and transmit- 
ted without loss or degradation. Yet, iron- 
ically, the preferred media on which this 
digital information is stored (disk, tape, and
even CD-ROM) have far shorter shelf lives
than acid-free paper or microfilm. More- 
over, these media tend to become unusable 
long before they reach their ultimate age 
limits. As technology evolves, it quickly 
reaches a point where older media can no 
longer be accessed by existing equipment. 
It is only somewhat facetious to express 
this irony by saying that digital data lasts 
forever, or five years, whichever comes
first. There is no theoretical problem with 
storing digital information on archival me- 
dia, including microfilm, but such media 
are not in popular use, nor does evidence 
suggest that they will become so. This 
problem has a straightforward, though 
cumbersome and relatively expensive, so- 
lution: to "update" or "migrate" data, that 
is, to copy the data from one medium to 
another as media wear out or become ob- 
solete. Although various technology trends 
(including the continued development of 
optical storage devices such as CD-ROM) 
may improve the longevity of media, the 
overall trend of continued improvement and 
replacement of media implies that the prob- 
lem of obsolescence is unlikely to disap- 
pear in the foreseeable future. 
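
The copy step at the heart of migration can be illustrated with a short sketch. The Python below is a hypothetical example, not a procedure described in this article: in-memory buffers stand in for the old and new media, the function names migrate and checksum are invented for illustration, and SHA-256 is only one assumed way of confirming that the bits arrived unchanged.

```python
import hashlib
import io


def migrate(source, target, chunk_size=65536):
    """Copy every byte from source to target.

    Returns the SHA-256 hex digest of the bytes that were copied.
    """
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()


def checksum(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of everything left in stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Assumption: in-memory buffers play the role of the old and new media.
old_medium = io.BytesIO(b"Minutes of the records committee, 1 April 1991\n")
new_medium = io.BytesIO()

copied_digest = migrate(old_medium, new_medium)

# Re-read the new medium from the start and compare digests.
new_medium.seek(0)
verified = checksum(new_medium) == copied_digest
print("migration verified:", verified)
```

A matching digest confirms only that the bits survived the copy; it says nothing about whether they can still be interpreted.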

Despite this problem, it is axiomatic that 
the records produced by governments, or- 
ganizations, individuals, and researchers 
themselves will become increasingly 
"electronic" over the next few decades. 
This implies that scholars of the not-so-dis- 
tant future will be confronted increasingly 
with electronic records as both the primary 
and secondary source materials for their re- 
search. Moreover, the current first gener- 
ation of such records will have unique 
historical significance, representing the most 
drastic change in the form and conception 
of records since the introduction of printing,
or even of writing. 185 Yet at the current
rate of technological change, electronic
documents (and the programs that produce 
and access them) typically become obsolete 
and unusable in a distressingly short time. 
How can the loss of this unique generation 
of records be prevented? How will scholars 
be able to understand and analyze these 
documents decades from now? How can 
archives hope to preserve such documents 
in a form scholars will be able to use? 

Furthermore, media longevity is only a 
part (and in many ways the easier part)
of the problem. Migrating data can keep 
them "accessible," but to be usable they 
must be more than just accessible: they must 
also be interpretable. The data stored on 
digital media are simply binary digits (bits), 
which cannot be interpreted without a 
translation of the codes they represent and 
an understanding of the structure in which 
they are placed on their media. Migrating
data may solve the media longevity prob- 
lem, but by itself it does not solve the larger 
problem. Like an illiterate monk dutifully 
copying text in a lost language, migration 
may save the bits but lose their meaning. 
Even if we assume that the media longevity 
problem can be solved, what technology 
trends bear on whether electronic records 
will be interpretable in the future? 
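
A minimal sketch of this point, in Python: the same seven bytes yield a readable word under one character code and something else entirely under another. The choice of EBCDIC (Python's cp037 codec) and Latin-1 is an assumption made only for illustration.

```python
# Seven bytes as they might sit on a migrated medium,
# written by a system that used an EBCDIC character code.
raw = "ARCHIVE".encode("cp037")

# Interpreted with the code it was written in, the text is recovered.
as_ebcdic = raw.decode("cp037")

# Interpreted with a different code, the same bits decode without error
# but no longer spell the original word.
as_latin1 = raw.decode("latin-1")

# A stricter code cannot interpret these bits at all.
try:
    raw.decode("ascii")
    ascii_readable = True
except UnicodeDecodeError:
    ascii_readable = False

print(as_ebcdic, repr(as_latin1), ascii_readable)
```

The bits are identical in all three cases; only the knowledge of which code applies separates a record from noise.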

This issue is often referred to as that of 
software-dependent records, though there 
is somewhat more to the problem than this 
term suggests. Software-dependent records 
are electronic documents that can be read 
only by using some particular piece of 
computer software (that is, a program). Ex- 
amples of software-dependent records in- 
clude documents created with word 
processing or electronic publishing pro- 
grams, spreadsheets, databases, geo- 
graphic information systems (GISs), and 



185 See Jay David Bolter, "Text and Technology:
Reading and Writing in the Electronic Age," Library
Resources and Technical Services 31 (January/March
1987): 12-23.



hypertext/hypermedia. Though a data file 
for such a document may be saved on some 
medium (such as a disk), the file can be 
properly interpreted only by its software; 
the document itself is accessible (and in 
some cases may come into existence) only 
by running the software. 186 This can be 
thought of as the problem of "preserving" 
electronic documents. However, in this case, 
"preservation" means more than simply
preserving media; unlike printed records, 
electronic records require software and 
hardware in order to be accessed and in- 
terpreted. 

The obvious way to access a software- 
dependent document is to run the software 
that produced it. However, programs them- 
selves quickly become obsolete, and run- 
ning obsolete software is currently very 
difficult. Any given program works only 
on certain computers and only with certain 
system software. This means that accessing 
a document may actually require the user 
to run this entire hardware and software 
environment. In fact, what is typically meant 
when a document is called "software-de- 
pendent" is that it can be accessed only by 
running the entire hardware and software 
environment in which it was created. The 
problem is that such environments become 
obsolete in the blink of an archival eye, and 
maintaining them in working condition be- 
yond that time is a complex, costly, and 
ultimately futile task. 187 Preserving



186 In a very real sense, all electronic documents are
software-dependent. Simple text and numeric files are
not typically referred to as "software-dependent" only
because they are encoded and stored in fairly straightforward
ways that currently are considered obvious
(e.g., simple sequences of ASCII codes representing
characters). Yet even these cannot be accessed or interpreted
without hardware and software that can understand
their encoding.

187 For several discussions of this issue, see David
Bearman, Collecting Software: A New Challenge for
Archives and Museums, Archives and Museum Informatics
Technical Report no. 2 (Pittsburgh: Archives
and Museum Informatics, 1987, reprinted 1990); and
Coalition for Networked Information Director Paul Evan

electronic documents in a way that will allow
future access to their form and meaning is 
therefore not straightforward. 

There appear to be two general ap- 
proaches to providing meaningful future 
access to software-dependent documents. 
Either they must be transformed in some 
way that makes them independent of the 
software that created them, or they must be 
saved along with some kind of description 
of their associated software sufficient to al- 
low accessing them as was originally in- 
tended. The first approach might be 
facilitated by the development of standards 
for various kinds of documents, whereas 
the second approach might be facilitated by 
the development of formal models of com- 
putation. Several technology trends bear on 
each of these approaches. 

Software-dependent documents might be 
preserved in a usable form by transforming 
them so that they become "software-inde- 
pendent" in some way. For each recog- 
nized category of program now in use (word 
processing, database, spreadsheet, etc.) a 
standard data file format might be defined, 
along with a standard set of functions that 
any such program can perform. For ex- 
ample, most word processing programs 
provide functions for displaying pages of 
text, footnotes, and chapter headings. In 
principle, a data file for a document from 
any such program could be transformed into 
some standard format, and its behavior could 
be duplicated by some standard program. 188
This transformation process would



Peters, "The Machine Aspects of Preservation,"
unpublished paper (ca. 1990).

188 SGML is an attempt to provide a standard for
this kind of text, though it is generally recognized that
even a standard for text will not magically remove all
the incompatibilities among existing word processing
formats. Another example of this approach that has
been discussed in the literature involves relational
databases. The argument has been made that a database
produced by any relational database management
system (RDBMS) can be transformed into a standard form
that can be used by any other RDBMS. See the
National Archives and Records Administration's



have to be repeated periodically as the standard
itself evolved. Standardization trends
such as those discussed above may help
make this possible. However, there may
always be programs whose behavior cannot
be duplicated by any standard or which do
not even fit into the recognized categories
of programs (e.g., word processing or
database). As noted above, standards generally
lag behind the advancing technology;
until computer science becomes far better
formalized (that is, based on firm theoretical
underpinnings), there will always be
programs that defy the most well-conceived
efforts at standardization. Policies
in various organizations may attempt to force
the use of programs that conform to standards,
but current trends of technological
innovation make enforcement difficult because
users find it hard to resist new
capabilities, whether they are standard or not.

Even aside from standardization efforts,
a "natural migration" of documents occurs
as the programs on which they depend
evolve through successive versions. New
versions of programs often provide upward
compatibility to allow old documents to
migrate into the required updated forms. It
may be possible, as has been suggested, 189
to rely to some extent on this phenomenon
to keep documents accessible. The effectiveness
of this approach, however, is limited
by the fact that periodic upheavals occur
in software paradigms. Two examples of



response to the recommendations in "Taking a Byte Out
of History," and Kenneth Thibodeau, "To Be or Not
to Be: Archives for Electronic Records," in Archival
Management of Electronic Records, edited by David
Bearman, 1-13. Although this may be true to a large
extent, it is a relatively atypical example; relational
database systems are one of the very few higher level
applications for which a formal (mathematical)
computational model exists. Most other common
applications, such as word processing, spreadsheets,
hypertext/hypermedia, or GISs are not nearly this well
formalized.

189 Dollar, The Impact of Information Technologies
on Archival Principles and Methods, Chapters 1-4,
draft version.
such upheavals are the change from simple
textual tables to spreadsheets and the change 
from hierarchical databases to relational 
databases. Such upheavals make it difficult 
enough to transform documents that are 
crucial to the daily functioning of organi- 
zations; transforming old documents that 
are no longer in use may require more ef- 
fort than most organizations are willing to 
spare. 

The alternative to transforming software- 
dependent documents into software-inde- 
pendent form is to interpret them by some- 
how using the software that they depend 
on, despite its being obsolete. Interpreta- 
tion does not necessarily require actually 
running the software. If a complete de- 
scription existed of how a program inter- 
prets its data files in accessing a document, 
it would not be necessary to save the software
itself (or its environment). The document
could be accessed by following this
description, effectively recreating the behavior
of the software. In most cases, unfortunately,
such complete descriptions of
software exist only in the form of the soft- 
ware itself. Computer science is not yet very 
good at describing what complex software 
does. 190
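
As a rough sketch of what "following a description" could mean, the Python below reads a stored record using only a declarative account of its layout rather than the program that wrote it. The layout, the field names, and the function read_record are hypothetical and chosen for illustration; the formats of real applications are far less tidy, which is the difficulty noted above.

```python
import struct

# Hypothetical description of a fixed-width record: (field name, struct code).
# "I" is a 4-byte unsigned integer, "H" a 2-byte unsigned integer,
# and "20s" a 20-byte text field padded with zero bytes.
LAYOUT = [("record_id", "I"), ("year", "H"), ("title", "20s")]
BYTE_ORDER = ">"  # big-endian, no padding; also part of the description


def read_record(data, layout=LAYOUT, byte_order=BYTE_ORDER):
    """Interpret raw bytes by following the layout description alone."""
    fmt = byte_order + "".join(code for _, code in layout)
    values = struct.unpack(fmt, data)
    record = {}
    for (name, code), value in zip(layout, values):
        if code.endswith("s"):
            # Text fields: drop the zero padding and decode as ASCII.
            value = value.rstrip(b"\x00").decode("ascii")
        record[name] = value
    return record


# Bytes as the (now unavailable) original program might have written them.
stored = struct.pack(">IH20s", 42, 1992, b"Annual report")

record = read_record(stored)
print(record)
```

Here the description is small enough to write down completely. The argument in the text is that for most complex software no such complete description exists apart from the software itself.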

Interpreting a software-dependent docu- 
ment by using the software it depends on 
therefore requires either being able to run 
the software that has been saved along with 
the document (by effectively recreating its 
environment), or interpreting the software 
without running it (effectively recreating, 
or emulating, its behavior). The former op- 
tion requires saving vast (though finite) 



190 There are exceptions to this, such as the relational
database case discussed above. In general, however,
current formal descriptive techniques cannot
capture the "human level" semantic behavior of programs.
What is required is a computational theory,
not of how programs work, but of what they do for
their users; i.e., a theory of human information
processing that describes such things as how humans
create and use documents and how humans interact
with each other to perform research.



documentation for the software and its en- 
vironment, including detailed technical de- 
scriptions of any required hardware and all 
of its components. 191 The latter option re- 
quires a more sophisticated computational 
theory than is currently available, i.e., an 
understanding of the semantics of what 
programs do at the human level of infor- 
mation processing and how they do it. 
Without such a theory, it remains impract- 
ical to interpret software except by running 
it in its original hardware and software en- 
vironment. 192 Current trends toward im- 
proving the formal specification of systems 
and environments may facilitate the former 
option, whereas trends toward modeling 
human level computational processes may 
facilitate the latter. 193 Finally, it should be 
noted that the overriding trend toward in- 
creased computational power may enable 
the performance of tasks that now appear 
unthinkable, just as we now routinely per- 
form computations that were unthinkable a 
decade or two ago. Such future tasks might 
include automatically decoding lost file 
structures, transforming obsolete document 
formats through successive generations of 
standards, or recreating the behavior of archaic



191 Although this is a huge task, it may not be insurmountable:
These environments could not exist in
the first place if they did not already possess such
technical descriptions. Furthermore, many of these
descriptions are already in patent or copyright offices,
where they might be accessible for this purpose.

192 Recreating the behavior of a program by figuring
out what it was intended to do and building a new
program that does what the original program did is
sometimes called "reverse engineering." It is widely
recognized as a difficult task.

193 Advances in computational theory may enable
future generations of scholars to understand how we
viewed and manipulated our documents far better than
we understand it ourselves. The present is, after all,
only the dawn of the information age, and the organizing
principles of the new "computation" paradigm
are only beginning to emerge. Future scholars may
have a far better formal (i.e., mathematical) understanding
of computation and human information
processing; this would provide them with a theoretical
framework that could explain any kind of software-dependence
and allow them to reconstruct past capabilities
at will.

computational environments from
imperfect documentation. These computa- 
tional possibilities may well allow future 
generations of scholars to derive the equiv- 
alent standard form of obsolete software- 
dependent documents in their archives or 
to reproduce the behavior of the software 
that produced them at will. 

In the context of scholarly research and 
information technology, the issue of soft- 
ware-dependent records can be phrased in 
terms of two questions: "How can access 
to software-dependent documents be provided
to future scholars?" and "What technology
can help to provide this access?"

To answer the first question, one must 
articulate certain assumptions about what 
kinds of access future scholars are likely to 
need to such documents and what they will 
do with them after they have accessed them. 
The software used to create a software-de- 
pendent document determines the capabil- 
ities available to its author for viewing and 
manipulating it. How accurately must 
scholars be able to reproduce these capa- 
bilities? Is it enough to preserve the content 
of such a document without its form? Is it 
enough to preserve its content and form 
without being able to recreate the way its 
author saw it? 194 These questions require 
making assumptions about the kinds of re- 
search future scholars will perform, which 
can be informed by analyzing trends in 
scholarly practice, as undertaken above. 
Given such assumptions, how would alter- 



194 Margaret Hedstrom suggests that "The solution
to preservation of electronic records lies somewhere
between the present approach of preserving only data
values and the need to retain all of the functionality
of an active records system. There are tremendous
advantages to retaining the descriptive, search, retrieval,
and manipulation functions of some automated
systems. The ability to retain more complex
electronic records and more of the useful functionality
of automated systems, however, will remain beyond
the control of archivists if they continue to utilize only
the tactics [that] have been employed in the past."
"Archives: To Be or Not to Be: A Commentary," in
Archival Management of Electronic Records, 28.



native software-dependent records manage- 
ment policies constrain or enhance the 
capabilities of future scholars in perform- 
ing their research using software-dependent 
documents? 195 

To answer the second question, one must 
articulate other assumptions about the tech- 
nological future (while recognizing that all 
such assumptions are speculative). 196 In 
particular, what do current technology trends 
imply about future capabilities for access- 
ing software-dependent records? 

Saving data files for software-dependent 
documents is a necessary but insufficient 
step toward making them usable. As dis- 
cussed above, data can be migrated to new 
media to keep them readable, but data must 
be more than just readable to be usable: 
They must also be interpretable. Is there 
some way to transform such documents be- 
fore saving them in archives, so that they 
can be used without their software? If so, 
what would this sacrifice in terms of being 
able to recreate the author's original capa- 
bilities? Alternatively, is there some prac- 
tical way of saving the software with each 
document (in particular, without maintain- 
ing obsolete hardware/software environ- 
ments) so that the software itself can be 
used in the future to access the document? 
If solutions to these problems are not found 
and implemented soon, much of the first 
generation of electronic documents, representing
a unique historical event in the
evolution of records, will be irretrievably
lost. 

To summarize, there appear to be two 
general approaches to solving this problem, 
as discussed in the archival literature: 
Transform each document and save it in 
software-independent form, or save the 
software for each document in some way 



195 This article raises this question without attempting
to answer it. Our point is that the assumptions that
underlie any answer must be made explicit.

196 The archival literature on this subject has not yet
generally articulated such assumptions.
that allows it to be used in the future to
interpret the document. 

The solutions that have been proposed in 
the literature for both approaches (e.g., 
translating documents into one of a few 
current standard forms or keeping hard- 
ware/software environments running for as 
long as possible) appear to be based on im- 
plicitly conservative assumptions about fu- 
ture technology. It seems likely, however, 
that inevitable advances in computational 
theory and computational power will pro- 
duce a vastly more capable future, enabling 
better, longer-range solutions to one or both 
of these approaches. This analysis has implications
for the actions that should be taken
now to ensure the preservation of these rec- 
ords. We see the following recommenda- 
tion as a necessary step toward deciding on 
such actions. 

Recommendation 3: The archival 
profession should establish an evolving 
policy on the management of software- 
dependent records, informed by an 
assessment of the kinds of access future 
scholars will require to such records and 
a realistic assessment of the 
computational capabilities that will be 
available in the future. 

Because of the short effective life of most 
electronic media and the rapidity with which 
software-dependent documents tend to be- 
come obsolete and unusable, this recom- 
mendation has an urgent aspect: Electronic 
records of enduring value that are not ap- 
propriately preserved will soon be lost to 
posterity. 

The archival profession should take steps 
to ensure that its evolving software-depen- 
dent records management policy considers 
the ways that future scholars are likely to 
use these records and the ways that future 
technology is likely to facilitate this use. 
Assessments, such as the one we have un- 
dertaken here, which attempt to analyze 
trends in scholarly practice and information 



technology should be used to attempt to 
project future needs and capabilities that 
are realistic, i.e., neither wishfully grandiose
nor unimaginatively chained to the
past. These projections should be used to 
produce evolving policies aimed at the 
moving target that is the future. 

Evolving trends in scholarly practice
should be sought out by the archival profes- 
sion, in an attempt to coordinate the de- 
velopment of archival policies with the 
perceptions and projections of those schol- 
ars who represent the leading edge of change 
in scholarly research practice. This coor- 
dination might be achieved through sched- 
uling paper sessions or panel discussions 
on evolving scholarly practice, to be pre- 
sented at archives and library science con- 
ferences and at conferences in various 
scholarly disciplines. Workshops, journals, 
or network discussions might also be or- 
ganized on this subject, soliciting input from 
scholars while establishing the archival 
profession as a focal point for this inquiry. 

Similarly, archivists should seek out 
evolving trends in technology, with partic- 
ular emphases on formalisms and standards 
for representing various kinds of documents
and on formal models of computation
and human information processing,
which ultimately may make it possible to 
describe the behavior of software in ways 
that will allow it to be emulated in the fu- 
ture. In this endeavor, archivists should ac- 
tively engage the computer science 
community as a partner, for example by 
organizing sessions or panels on these sub- 
jects at both computer science and archives 
conferences. 

Finally, archivists should engage in an 
ongoing effort to understand the most likely 
future uses of software-dependent records, 
and they should articulate their assump- 
tions about future scholarly practice and fu- 
ture computational capability as a 
prerequisite for proposing archival policies 
on the management of software-dependent 
records. 

Part III: Recognizing and Rewarding
Leadership 

Recommendation 4: The archival 
profession should reward activities that 
advance archival practice with 
information technology, electronic 
records, and electronic communication. 

The archival profession must respond to 
the changing patterns of scholarly com- 
munication and the emergence of a new 
communication medium. Leadership ca- 
pable of guiding the archival profession 
should be cultivated by promoting graduate 
education programs, collaborative projects, 
and professional coalitions targeted at ad- 
vancing archival operations in global net- 
work environments. The Society of 
American Archivists and the field's other
professional associations should recognize 
and reward excellence in research, pilot 
projects, collaborative associations, and 
programmatic implementations related to 
the management of electronic records, the 
use of information technology to improve 
archival practice, and the establishment of 
archival methods suitable to modern com- 
munication mediums. 